Re: OpenMeetings 80 and 140 users performance test results

[email protected] Wed, 10 Feb 2021 18:04:32 -0800

I'll retry with low-resources docker and will report back
=> Sure if that helps, anyhow I did run 10 performance tests and they all
came with the same results.


BTW We can add "lightweight SCrypt" into OM source code
=> Yeah or make the N value configurable via applicationContext or
properties

Thanks,
Seb

Sebastian Wagner
Director Arrakeen Solutions, OM-Hosting.com
http://arrakeen-solutions.co.nz/
https://om-hosting.com - Cloud & Server Hosting for HTML5
Video-Conferencing OpenMeetings
<https://www.youracclaim.com/badges/da4e8828-743d-4968-af6f-49033f10d60a/public_url>
<https://www.youracclaim.com/badges/b7e709c6-aa87-4b02-9faf-099038475e36/public_url>


On Thu, 11 Feb 2021 at 14:55, Maxim Solodovnik <[email protected]> wrote:

>
>
> On Thu, 11 Feb 2021 at 02:47, [email protected] <[email protected]>
> wrote:
>
>> Well you saw my results against the actual server.
>> And how the single change of reducing the crypt complexity down to some
>> more reasonable number just reduced the problem to zero.
>>
>> I think it's all relative to the server load.
>>
>> For example: Assuming the server gets 8 times slower
>> 500ms normally => 8 times slower => 2seconds
>> 20ms normally => 8 times slower => 80ms
>>
>> Also processing heavy methods usually are more affected as pure database
>> queries in this scenario.
>>
>
> I share my very simple JS test
> It is for UserWebService.login and not "pure DB query"
> I'll retry with low-resources docker and will report back
>
> (I believe it is important to get reproducible results before we will make
> changes :))
>
> BTW We can add "lightweight SCrypt" into OM source code
> so it will be shipped with every build
> But I would like to set current implementation as default
> Does it make sense?
>
>
>> For example: OpenMeetings/Tomcat gets 8 times slower
>> login method (highly depending on CPU because of Scrypt factor) => gets
>> 20 times slower => 500 x 20 = 10 seconds (which is what we saw in my
>> results)
>> while a simple method that just returns a DB query (there is very low
>> load on the DB/RDS instance) is still very fast: => only gets 2 times
>> slower => 20ms => 40ms (which again is what my results show, other Dao
>> methods are just fine!)
>>
>> So all of this I think is relative. You can't just test methods in
>> isolation. (it is still a good and worthwhile test though). But it only
>> proves they work on their own.
>> Doesn't mean they work/scale the same when you put them together.
>>
>> Thanks
>> Sebastian
>>
>> Sebastian Wagner
>> Director Arrakeen Solutions, OM-Hosting.com
>> http://arrakeen-solutions.co.nz/
>> https://om-hosting.com - Cloud & Server Hosting for HTML5
>> Video-Conferencing OpenMeetings
>>
>> <https://www.youracclaim.com/badges/da4e8828-743d-4968-af6f-49033f10d60a/public_url>
>> <https://www.youracclaim.com/badges/b7e709c6-aa87-4b02-9faf-099038475e36/public_url>
>>
>>
>> On Wed, 10 Feb 2021 at 23:26, Maxim Solodovnik <[email protected]>
>> wrote:
>>
>>> Sebastian,
>>>
>>> can you comment my tests?
>>>
>>> for now I see no effect on adding index
>>> And I see no performance issues with login ...
>>>
>>> On Mon, 8 Feb 2021 at 09:32, Maxim Solodovnik <[email protected]>
>>> wrote:
>>>
>>>> Sebastian,
>>>>
>>>> I'm unable to reproduce any "login timing" issues
>>>> as you can see from my previous email login takes ~0.6 second without
>>>> ant indices
>>>> What am I doing wrong?
>>>>
>>>> On Sat, 6 Feb 2021 at 11:53, Maxim Solodovnik <[email protected]>
>>>> wrote:
>>>>
>>>>> Hello Sebastian,
>>>>>
>>>>> just tried to reproduce your results:
>>>>>
>>>>> var start = Date.now();
>>>>> for (var i = 0; i < 200; ++i) {
>>>>>   var localStart = Date.now();
>>>>>   $.ajax({
>>>>>     method: "GET",
>>>>>     async: false,
>>>>>     url: "services/user/login",
>>>>>     data: {user: 'ui_admin', pass: '12345'},
>>>>>     dataType: "json"
>>>>>   });
>>>>>   console.info("Elapsed single: " + (Date.now() - localStart));
>>>>> }
>>>>> console.info("Elapsed: " + (Date.now() - start));
>>>>>
>>>>> The result is:
>>>>>
>>>>> Elapsed single: 761
>>>>> Elapsed single: 578
>>>>> Elapsed single: 590
>>>>> Elapsed single: 589
>>>>> Elapsed single: 632
>>>>> Elapsed single: 580
>>>>> Elapsed single: 580
>>>>> Elapsed single: 577
>>>>> Elapsed single: 575
>>>>> Elapsed single: 558
>>>>> Elapsed single: 563
>>>>> Elapsed single: 579
>>>>> Elapsed single: 559
>>>>> Elapsed single: 565
>>>>> Elapsed single: 592
>>>>> Elapsed single: 575
>>>>> Elapsed single: 567
>>>>> Elapsed single: 645
>>>>> Elapsed single: 699
>>>>> Elapsed single: 590
>>>>> Elapsed single: 578
>>>>> Elapsed single: 565
>>>>> Elapsed single: 578
>>>>> Elapsed single: 600
>>>>> Elapsed single: 591
>>>>> Elapsed single: 566
>>>>> Elapsed single: 577
>>>>> Elapsed single: 569
>>>>> Elapsed single: 621
>>>>> Elapsed single: 588
>>>>> Elapsed single: 598
>>>>> Elapsed single: 635
>>>>> Elapsed single: 612
>>>>> Elapsed single: 621
>>>>> Elapsed single: 575
>>>>> Elapsed single: 589
>>>>> Elapsed single: 591
>>>>> Elapsed single: 556
>>>>> Elapsed single: 589
>>>>> Elapsed single: 579
>>>>> Elapsed single: 573
>>>>> Elapsed single: 567
>>>>> Elapsed single: 581
>>>>> Elapsed single: 564
>>>>> Elapsed single: 620
>>>>> Elapsed single: 660
>>>>> Elapsed single: 608
>>>>> Elapsed single: 587
>>>>> Elapsed single: 623
>>>>> Elapsed single: 635
>>>>> Elapsed single: 623
>>>>> Elapsed single: 632
>>>>> Elapsed single: 590
>>>>> Elapsed single: 607
>>>>> Elapsed single: 836
>>>>> Elapsed single: 608
>>>>> Elapsed single: 614
>>>>> Elapsed single: 565
>>>>> Elapsed single: 593
>>>>> Elapsed single: 585
>>>>> Elapsed single: 601
>>>>> Elapsed single: 601
>>>>> Elapsed single: 611
>>>>> Elapsed single: 566
>>>>> Elapsed single: 601
>>>>> Elapsed single: 766
>>>>> Elapsed single: 601
>>>>> Elapsed single: 588
>>>>> Elapsed single: 602
>>>>> Elapsed single: 578
>>>>> Elapsed single: 576
>>>>> Elapsed single: 599
>>>>> Elapsed single: 579
>>>>> Elapsed single: 569
>>>>> Elapsed single: 588
>>>>> Elapsed single: 576
>>>>> Elapsed single: 578
>>>>> Elapsed single: 564
>>>>> Elapsed single: 578
>>>>> Elapsed single: 581
>>>>> Elapsed single: 598
>>>>> Elapsed single: 580
>>>>> Elapsed single: 619
>>>>> Elapsed single: 578
>>>>> Elapsed single: 590
>>>>> Elapsed single: 589
>>>>> Elapsed single: 577
>>>>> Elapsed single: 581
>>>>> Elapsed single: 584
>>>>> Elapsed single: 555
>>>>> Elapsed single: 599
>>>>> Elapsed single: 582
>>>>> Elapsed single: 586
>>>>> Elapsed single: 578
>>>>> Elapsed single: 578
>>>>> Elapsed single: 590
>>>>> Elapsed single: 608
>>>>> Elapsed single: 580
>>>>> Elapsed single: 601
>>>>> Elapsed single: 567
>>>>> Elapsed single: 622
>>>>> Elapsed single: 567
>>>>> Elapsed single: 597
>>>>> Elapsed single: 611
>>>>> Elapsed single: 579
>>>>> Elapsed single: 601
>>>>> Elapsed single: 610
>>>>> Elapsed single: 578
>>>>> Elapsed single: 598
>>>>> Elapsed single: 580
>>>>> Elapsed single: 603
>>>>> Elapsed single: 618
>>>>> Elapsed single: 655
>>>>> Elapsed single: 589
>>>>> Elapsed single: 590
>>>>> Elapsed single: 602
>>>>> Elapsed single: 599
>>>>> Elapsed single: 589
>>>>> Elapsed single: 610
>>>>> Elapsed single: 582
>>>>> Elapsed single: 616
>>>>> Elapsed single: 590
>>>>> Elapsed single: 634
>>>>> Elapsed single: 616
>>>>> Elapsed single: 597
>>>>> Elapsed single: 589
>>>>> Elapsed single: 584
>>>>> Elapsed single: 602
>>>>> Elapsed single: 600
>>>>> Elapsed single: 589
>>>>> Elapsed single: 643
>>>>> Elapsed single: 589
>>>>> Elapsed single: 598
>>>>> Elapsed single: 589
>>>>> Elapsed single: 579
>>>>> Elapsed single: 605
>>>>> Elapsed single: 606
>>>>> Elapsed single: 643
>>>>> Elapsed single: 591
>>>>> Elapsed single: 589
>>>>> Elapsed single: 612
>>>>> Elapsed single: 587
>>>>> Elapsed single: 589
>>>>> Elapsed single: 599
>>>>> Elapsed single: 587
>>>>> Elapsed single: 578
>>>>> Elapsed single: 569
>>>>> Elapsed single: 590
>>>>> Elapsed single: 609
>>>>> Elapsed single: 599
>>>>> Elapsed single: 591
>>>>> Elapsed single: 620
>>>>> Elapsed single: 600
>>>>> Elapsed single: 567
>>>>> Elapsed single: 589
>>>>> Elapsed single: 569
>>>>> Elapsed single: 574
>>>>> Elapsed single: 567
>>>>> Elapsed single: 591
>>>>> Elapsed single: 578
>>>>> Elapsed single: 588
>>>>> Elapsed single: 616
>>>>> Elapsed single: 595
>>>>> Elapsed single: 567
>>>>> Elapsed single: 588
>>>>> Elapsed single: 601
>>>>> Elapsed single: 589
>>>>> Elapsed single: 577
>>>>> Elapsed single: 599
>>>>> Elapsed single: 588
>>>>> Elapsed single: 612
>>>>> Elapsed single: 623
>>>>> Elapsed single: 610
>>>>> Elapsed single: 582
>>>>> Elapsed single: 584
>>>>> Elapsed single: 588
>>>>> Elapsed single: 661
>>>>> Elapsed single: 619
>>>>> Elapsed single: 602
>>>>> Elapsed single: 575
>>>>> Elapsed single: 613
>>>>> Elapsed single: 585
>>>>> Elapsed single: 623
>>>>> Elapsed single: 623
>>>>> Elapsed single: 625
>>>>> Elapsed single: 630
>>>>> Elapsed single: 591
>>>>> Elapsed single: 569
>>>>> Elapsed single: 599
>>>>> Elapsed single: 587
>>>>> Elapsed single: 588
>>>>> Elapsed single: 606
>>>>> Elapsed single: 595
>>>>> Elapsed single: 635
>>>>> Elapsed single: 641
>>>>> Elapsed single: 613
>>>>> Elapsed single: 643
>>>>> Elapsed single: 612
>>>>> Elapsed single: 591
>>>>> Elapsed single: 587
>>>>> Elapsed: 119587
>>>>>
>>>>>
>>>>> this is the version from master, no changes
>>>>> approx half a second for login ....
>>>>>
>>>>> On Fri, 5 Feb 2021 at 12:52, [email protected] <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> my password was 100% correct. Cause it a test script I run. There
>>>>>> wasn't a
>>>>>> single error during login.
>>>>>>
>>>>>> And the method got invoked exactly 80 times (or 140 times in the
>>>>>> bigger
>>>>>> test) (you can check my dashboards:))
>>>>>>
>>>>>> You can see the number of calls and the rate in this graph (for the
>>>>>> 140
>>>>>> users test):
>>>>>>
>>>>>> http://54.162.44.21:5080/graph?g0.expr=sum(increase(webapp_metrics_filter_count%7Bpath%3D~%22%2Fopenmeetings%2Fservices%2Fuser%2F.%2B%22%7D%5B1m%5D))&g0.tab=0&g0.stacked=0&g0.range_input=15m&g0.end_input=2021-02-05%2002%3A52%3A00&g0.moment_input=2021-02-05%2002%3A52%3A00
>>>>>>
>>>>>> Will invoking this method 140 times in a 5min period make the CPU
>>>>>> spike to
>>>>>> 95% and lead to 10seconds call length ? :)
>>>>>>
>>>>>> Also like I say, I have the same metric on the UserDao::login => And
>>>>>> it
>>>>>> looks the same.
>>>>>>
>>>>>> I don't think so.
>>>>>>
>>>>>> Thanks
>>>>>> Seb
>>>>>>
>>>>>>
>>>>>> Sebastian Wagner
>>>>>> Director Arrakeen Solutions, OM-Hosting.com
>>>>>> http://arrakeen-solutions.co.nz/
>>>>>> https://om-hosting.com - Cloud & Server Hosting for HTML5
>>>>>> Video-Conferencing OpenMeetings
>>>>>> <
>>>>>> https://www.youracclaim.com/badges/da4e8828-743d-4968-af6f-49033f10d60a/public_url
>>>>>> >
>>>>>> <
>>>>>> https://www.youracclaim.com/badges/b7e709c6-aa87-4b02-9faf-099038475e36/public_url
>>>>>> >
>>>>>>
>>>>>>
>>>>>> On Fri, 5 Feb 2021 at 18:44, Maxim Solodovnik <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>> > Can it be password hash function?
>>>>>> > It should be slow (for security reasons)
>>>>>> > And It should be even slower if wrong password was provided
>>>>>> >
>>>>>> > On Fri, 5 Feb 2021 at 12:42, [email protected] <
>>>>>> [email protected]>
>>>>>> > wrote:
>>>>>> >
>>>>>> > > It is login via API: /services/users/login/
>>>>>> > >
>>>>>> > > But: I've put the metric also on the UserDao::login and the
>>>>>> timings look
>>>>>> > > identical.
>>>>>> > >
>>>>>> > > So it would matter how you login I think.
>>>>>> > >
>>>>>> > > It just seems so strange that this method takes so long while
>>>>>> RoomDao and
>>>>>> > > others don't behave that way.
>>>>>> > >
>>>>>> > > Thanks
>>>>>> > > Seb
>>>>>> > >
>>>>>> > > Sebastian Wagner
>>>>>> > > Director Arrakeen Solutions, OM-Hosting.com
>>>>>> > > http://arrakeen-solutions.co.nz/
>>>>>> > > https://om-hosting.com - Cloud & Server Hosting for HTML5
>>>>>> > > Video-Conferencing OpenMeetings
>>>>>> > > <
>>>>>> > >
>>>>>> >
>>>>>> https://www.youracclaim.com/badges/da4e8828-743d-4968-af6f-49033f10d60a/public_url
>>>>>> > > >
>>>>>> > > <
>>>>>> > >
>>>>>> >
>>>>>> https://www.youracclaim.com/badges/b7e709c6-aa87-4b02-9faf-099038475e36/public_url
>>>>>> > > >
>>>>>> > >
>>>>>> > >
>>>>>> > > On Fri, 5 Feb 2021 at 18:38, Maxim Solodovnik <
>>>>>> [email protected]>
>>>>>> > > wrote:
>>>>>> > >
>>>>>> > > > sorry for top posting
>>>>>> > > >
>>>>>> > > > will re-read more carefully later
>>>>>> > > >
>>>>>> > > > What I would like to do as first step:
>>>>>> > > > 1) add your index to DB
>>>>>> > > > 2) analyze your your login results and write Unit test for login
>>>>>> > > > was it login by hash or login by username/password?
>>>>>> > > >
>>>>>> > > >
>>>>>> > > > On Fri, 5 Feb 2021 at 12:28, [email protected] <
>>>>>> > > [email protected]>
>>>>>> > > > wrote:
>>>>>> > > >
>>>>>> > > > > Hi,
>>>>>> > > > >
>>>>>> > > > > as you noticed I added a branch with performance metrics. Add
>>>>>> did
>>>>>> > rerun
>>>>>> > > > > tests with similar and large user numbers.
>>>>>> > > > >
>>>>>> > > > > Results
>>>>>> > > > >
>>>>>> > > > >    - *80 users test*
>>>>>> > > > >    test:
>>>>>> > > > >
>>>>>> > > >
>>>>>> > >
>>>>>> >
>>>>>> https://cwiki.apache.org/confluence/display/OPENMEETINGS/OpenMeetings+80+users+test
>>>>>> > > > >    <
>>>>>> > > > >
>>>>>> > > >
>>>>>> > >
>>>>>> >
>>>>>> https://cwiki.apache.org/confluence/display/OPENMEETINGS/OpenMeetings+80+users+test
>>>>>> > > > > >
>>>>>> > > > >    - *140 users test*
>>>>>> > > > >
>>>>>> > > > >
>>>>>> > > >
>>>>>> > >
>>>>>> >
>>>>>> https://cwiki.apache.org/confluence/display/OPENMEETINGS/OpenMeetings+140+users+test
>>>>>> > > > >
>>>>>> > > > > Findings
>>>>>> > > > >
>>>>>> > > > > *1 - Added index in address.email*
>>>>>> > > > >
>>>>>> > > > > I actually did the *80 users test* twice. I found that I could
>>>>>> > improve
>>>>>> > > > the
>>>>>> > > > > performance of the login command by adding some index into
>>>>>> > > address.email
>>>>>> > > > > See:
>>>>>> > > > >
>>>>>> > > > >
>>>>>> > > >
>>>>>> > >
>>>>>> >
>>>>>> https://cwiki.apache.org/confluence/display/OPENMEETINGS/OpenMeetings+80+users+test#OpenMeetings80userstest-CPUandmemoryusage
>>>>>> > > > > <
>>>>>> > > > >
>>>>>> > > >
>>>>>> > >
>>>>>> >
>>>>>> https://cwiki.apache.org/confluence/display/OPENMEETINGS/OpenMeetings+80+users+test#OpenMeetings80userstest-CPUandmemoryusage
>>>>>> > > > > >
>>>>>> > > > >
>>>>>> > > > > *2 - Login command performance increasingly bad with number
>>>>>> of users*
>>>>>> > > > >
>>>>>> > > > > While I could improve some of the UserService::login (or
>>>>>> generally
>>>>>> > > > > UserDao::login) with the index, *then I switched from 80
>>>>>> users to 140
>>>>>> > > > > users*
>>>>>> > > > > and rerun the tests: *The login command started to take 10+
>>>>>> seconds*
>>>>>> > > > >
>>>>>> > > > >    - I also used an actual log command to verify my metrics
>>>>>> because
>>>>>> > it
>>>>>> > > > just
>>>>>> > > > >    seemed so strange !
>>>>>> > > > >    - I also checked other methods, they are NOT affected. Or
>>>>>> at least
>>>>>> > > by
>>>>>> > > > a
>>>>>> > > > >    very far margin increase very slightly. It's NOT generally
>>>>>> slow,
>>>>>> > but
>>>>>> > > > > some
>>>>>> > > > >    part of the application is slow!
>>>>>> > > > >    - You can see how the duration increases in length here
>>>>>> during the
>>>>>> > > > test
>>>>>> > > > >    run:
>>>>>> > > > >
>>>>>> > > > >
>>>>>> > > >
>>>>>> > >
>>>>>> >
>>>>>> https://cwiki.apache.org/confluence/display/OPENMEETINGS/OpenMeetings+140+users+test#OpenMeetings140userstest-Loginwebservicecallhits10seconds
>>>>>> > > > >    - And you can see here that its not just the HTTP call
>>>>>> that takes
>>>>>> > > long
>>>>>> > > > >    but the actual UserDao:login command:
>>>>>> > > > >
>>>>>> > > > >
>>>>>> > > >
>>>>>> > >
>>>>>> >
>>>>>> https://cwiki.apache.org/confluence/display/OPENMEETINGS/OpenMeetings+140+users+test#OpenMeetings140userstest-DatabaseUserDaologinmethodisbyfarthelongestrunningoneandhitsplus10seconds
>>>>>> > > > >
>>>>>> > > > > => Now this is kind of puzzling. Other methods don't perform
>>>>>> that
>>>>>> > > > > increasingly bad, its just this login command. Like
>>>>>> RoomDao::method
>>>>>> > do
>>>>>> > > > not
>>>>>> > > > > perform badly. Or not at that scale (by far not).
>>>>>> > > > >
>>>>>> > > > > *Questions:*
>>>>>> > > > >
>>>>>> > > > >    - Is it possible that the login method that fetches the
>>>>>> entire
>>>>>> > user
>>>>>> > > > has
>>>>>> > > > >    too many entities linked? SO that even a simple fetch or
>>>>>> login
>>>>>> > > starts
>>>>>> > > > to
>>>>>> > > > >    take a long time quickly ?
>>>>>> > > > >    - Is it really possible that this "only this method" takes
>>>>>> so long
>>>>>> > > or
>>>>>> > > > am
>>>>>> > > > >    I measuring a general issue because the CPU spike is at
>>>>>> 95% ?
>>>>>> > Like I
>>>>>> > > > > say:
>>>>>> > > > >    RoomDao seems fine, are there other methods I should
>>>>>> measure ?
>>>>>> > > > >
>>>>>> > > > > *3 - Tomcat threads unused or way below max*
>>>>>> > > > >
>>>>>> > > > > As you can see between those two graphs:
>>>>>> > > > >
>>>>>> > > > >    - 80 users test active threads:
>>>>>> > > > >
>>>>>> > > > >
>>>>>> > > >
>>>>>> > >
>>>>>> >
>>>>>> https://cwiki.apache.org/confluence/display/OPENMEETINGS/OpenMeetings+80+users+test#OpenMeetings80userstest-Tomcatactivethreads
>>>>>> > > > >    - 140 users test active threads:
>>>>>> > > > >
>>>>>> > > > >
>>>>>> > > >
>>>>>> > >
>>>>>> >
>>>>>> https://cwiki.apache.org/confluence/display/OPENMEETINGS/OpenMeetings+140+users+test#OpenMeetings140userstest-Tomcatactivethreads
>>>>>> > > > >
>>>>>> > > > > => 140 users utilize more threads. YES. But its WAY below the
>>>>>> 400
>>>>>> > > > > available. Not even closely to that number. I don't think its
>>>>>> a
>>>>>> > > threading
>>>>>> > > > > issue. It also makes sense. Cause given the test scenario you
>>>>>> > probably
>>>>>> > > > have
>>>>>> > > > > around 20-40 users hitting the server trying to login and
>>>>>> enter the
>>>>>> > > room
>>>>>> > > > at
>>>>>> > > > > the same time.
>>>>>> > > > >
>>>>>> > > > > *4 - RoomPanel:onInitialise and RoomPanel:enterRoom*
>>>>>> > > > >
>>>>>> > > > > I put some metrics on those methods, you can see the results
>>>>>> of the
>>>>>> > 140
>>>>>> > > > > users test run here:
>>>>>> > > > >
>>>>>> > > > >
>>>>>> > > >
>>>>>> > >
>>>>>> >
>>>>>> http://54.162.44.21:5080/graph?g0.expr=rate(org_openmeetings_metrics_sum%7Btype%3D%22application%22%2Cclass%3D%22RoomPanel%22%7D%5B1m%5D)%0A%2F%0A%20%20rate(org_openmeetings_metrics_count%7Btype%3D%22application%22%2Cclass%3D%22RoomPanel%22%7D%5B1m%5D)&g0.tab=0&g0.stacked=0&g0.range_input=15m&g0.end_input=2021-02-05%2002%3A52%3A00&g0.moment_input=2021-02-05%2002%3A52%3A00
>>>>>> > > > >  => It is definitely getting worse BUT this is MILLISECONDS.
>>>>>> It
>>>>>> > > increased
>>>>>> > > > > 0.05seconds to 0.3seconds. That's getting slower but not that
>>>>>> bad.
>>>>>> > > > >
>>>>>> > > > > If you compare the same graph with the login command:
>>>>>> > > > >
>>>>>> > > > >
>>>>>> > > >
>>>>>> > >
>>>>>> >
>>>>>> http://54.162.44.21:5080/graph?g0.expr=rate(webapp_metrics_filter_sum%7Bpath%3D~%22%2Fopenmeetings%2Fservices%2Fuser%2F.%2B%22%7D%5B1m%5D)%0A%2F%0A%20%20rate(webapp_metrics_filter_count%7Bpath%3D~%22%2Fopenmeetings%2Fservices%2Fuser%2F.%2B%22%7D%5B1m%5D)&g0.tab=0&g0.stacked=0&g0.range_input=15m&g0.end_input=2021-02-05%2002%3A52%3A00&g0.moment_input=2021-02-05%2002%3A52%3A00
>>>>>> > > > >  => That is going from ~800ms to 12seconds!
>>>>>> > > > >
>>>>>> > > > > It shows some similarity in the curve. But the durations are
>>>>>> very
>>>>>> > > > > different!
>>>>>> > > > >
>>>>>> > > > > *5 - Missing video pods massively increased from 80 to 140
>>>>>> users*
>>>>>> > > > >
>>>>>> > > > > During that time when login is slow the amount of conference
>>>>>> rooms
>>>>>> > > where
>>>>>> > > > > video pods are missing is massively increased.
>>>>>> > > > >
>>>>>> > > > > That could be because the OpenMeetings Server is generally
>>>>>> slow.
>>>>>> > Could
>>>>>> > > be
>>>>>> > > > > because something is particularly slow within fetching the
>>>>>> user
>>>>>> > entity
>>>>>> > > or
>>>>>> > > > > login command.
>>>>>> > > > >
>>>>>> > > > > *6 - Missing video always issue on sender side*
>>>>>> > > > >
>>>>>> > > > > From my tests I could see that - if a video is missing that is
>>>>>> > because
>>>>>> > > a
>>>>>> > > > > SENDER has:
>>>>>> > > > > A) Not started the video
>>>>>> > > > > B) Stuck in the video being published
>>>>>> > > > >
>>>>>> > > > > Like I say, the case of (a) it MASSIVELY increased once the
>>>>>> server
>>>>>> > had
>>>>>> > > > 140
>>>>>> > > > > users. But I think it's valuable information to know that
>>>>>> there seem
>>>>>> > to
>>>>>> > > > be
>>>>>> > > > > TWO issues. One related with video pods not starting up, the
>>>>>> other
>>>>>> > > around
>>>>>> > > > > once the video pod started, the video stream doesn't somehow
>>>>>> get
>>>>>> > > > triggered.
>>>>>> > > > >
>>>>>> > > > >
>>>>>> > > > > Generally - digging through above graphs and reports
>>>>>> > > > >
>>>>>> > > > > All above metrics and test runs you can dig through yourself.
>>>>>> I will
>>>>>> > > > > obviously delete those dashboards shortly again. But you can
>>>>>> have a
>>>>>> > dig
>>>>>> > > > > through it:
>>>>>> > > > >
>>>>>> > > > >    - Dashboard for v5.1.0 with additional indexes on
>>>>>> Address.email
>>>>>> > > > >    <
>>>>>> > > > >
>>>>>> > > >
>>>>>> > >
>>>>>> >
>>>>>> http://54.162.44.21:5080/graph?g0.expr=rate(webapp_metrics_filter_sum%7Bpath%3D~%22%2Fopenmeetings%2Fservices%2F.%2B%22%7D%5B1m%5D)%0A%2F%0A%20%20rate(webapp_metrics_filter_count%7Bpath%3D~%22%2Fopenmeetings%2Fservices%2F.%2B%22%7D%5B1m%5D)&g0.tab=0&g0.stacked=0&g0.range_input=15m&g0.end_input=2021-02-05%2000%3A30%3A00&g0.moment_input=2021-02-05%2000%3A30%3A00&g1.expr=rate(org_openmeetings_metrics_sum%7Btype%3D%22application%22%7D%5B1m%5D)%0A%2F%0A%20%20rate(org_openmeetings_metrics_count%7Btype%3D%22application%22%7D%5B1m%5D)&g1.tab=0&g1.stacked=0&g1.range_input=15m&g1.end_input=2021-02-05%2000%3A30%3A00&g1.moment_input=2021-02-05%2000%3A30%3A00&g2.expr=rate(org_openmeetings_metrics_sum%7Btype%3D%22database%22%7D%5B1m%5D)%0A%2F%0A%20%20rate(org_openmeetings_metrics_count%7Btype%3D%22database%22%7D%5B1m%5D)&g2.tab=0&g2.stacked=0&g2.range_input=15m&g2.end_input=2021-02-05%2000%3A30%3A00&g2.moment_input=2021-02-05%2000%3A30%3A00&g3.expr=tomcat_threads_active_total&g3.tab=0&g3.stacked=0&g3.range_input=15m&g3.end_input=2021-02-05%2000%3A30%3A00&g3.moment_input=2021-02-05%2000%3A30%3A00
>>>>>> > > > > >
>>>>>> > > > >    - Test with 140 user at 2021-02-05 02:52:00
>>>>>> > > > >    <
>>>>>> > > > >
>>>>>> > > >
>>>>>> > >
>>>>>> >
>>>>>> http://54.162.44.21:5080/graph?g0.expr=rate(webapp_metrics_filter_sum%7Bpath%3D~%22%2Fopenmeetings%2Fservices%2F.%2B%22%7D%5B1m%5D)%0A%2F%0A%20%20rate(webapp_metrics_filter_count%7Bpath%3D~%22%2Fopenmeetings%2Fservices%2F.%2B%22%7D%5B1m%5D)&g0.tab=0&g0.stacked=0&g0.range_input=15m&g0.end_input=2021-02-05%2002%3A52%3A00&g0.moment_input=2021-02-05%2002%3A52%3A00&g1.expr=rate(org_openmeetings_metrics_sum%7Btype%3D%22application%22%7D%5B1m%5D)%0A%2F%0A%20%20rate(org_openmeetings_metrics_count%7Btype%3D%22application%22%7D%5B1m%5D)&g1.tab=0&g1.stacked=0&g1.range_input=15m&g1.end_input=2021-02-05%2002%3A52%3A00&g1.moment_input=2021-02-05%2002%3A52%3A00&g2.expr=rate(org_openmeetings_metrics_sum%7Btype%3D%22database%22%7D%5B1m%5D)%0A%2F%0A%20%20rate(org_openmeetings_metrics_count%7Btype%3D%22database%22%7D%5B1m%5D)&g2.tab=0&g2.stacked=0&g2.range_input=15m&g2.end_input=2021-02-05%2002%3A52%3A00&g2.moment_input=2021-02-05%2002%3A52%3A00&g3.expr=tomcat_threads_active_total&g3.tab=0&g3.stacked=0&g3.range_input=15m&g3.end_input=2021-02-05%2002%3A52%3A00&g3.moment_input=2021-02-05%2002%3A52%3A00
>>>>>> > > > > >
>>>>>> > > > >
>>>>>> > > > > If you go to those dashboards it will help probably a lot to
>>>>>> ready if
>>>>>> > > you
>>>>>> > > > > have a look at:
>>>>>> > > > >
>>>>>> > > > >
>>>>>> > > >
>>>>>> > >
>>>>>> >
>>>>>> https://cwiki.apache.org/confluence/display/OPENMEETINGS/Prometheus+Logging+and+Metrics
>>>>>> > > > >
>>>>>> > > > > *What else ????*
>>>>>> > > > >
>>>>>> > > > >
>>>>>> > > > >    - What are we thinking on the above findings ? Is login
>>>>>> method a
>>>>>> > > > >    performance issue or just a symptom of a generally slow
>>>>>> server ?
>>>>>> > It
>>>>>> > > > > doesn't
>>>>>> > > > >    really look like a general issue. The difference between
>>>>>> > > UserDao:login
>>>>>> > > > > and
>>>>>> > > > >    EVERYTHING else seems just too drastic
>>>>>> > > > >    - What other methods and calls should I try to trace and
>>>>>> measure ?
>>>>>> > > > >    - Is there any event within initialising a video pod that
>>>>>> I could
>>>>>> > > > >    measure and metric ?
>>>>>> > > > >
>>>>>> > > > > Any other ideas ?
>>>>>> > > > >
>>>>>> > > > > Thanks
>>>>>> > > > > Sebastian
>>>>>> > > > >
>>>>>> > > > > Sebastian Wagner
>>>>>> > > > > Director Arrakeen Solutions, OM-Hosting.com
>>>>>> > > > > http://arrakeen-solutions.co.nz/
>>>>>> > > > > https://om-hosting.com - Cloud & Server Hosting for HTML5
>>>>>> > > > > Video-Conferencing OpenMeetings
>>>>>> > > > > <
>>>>>> > > > >
>>>>>> > > >
>>>>>> > >
>>>>>> >
>>>>>> https://www.youracclaim.com/badges/da4e8828-743d-4968-af6f-49033f10d60a/public_url
>>>>>> > > > > >
>>>>>> > > > > <
>>>>>> > > > >
>>>>>> > > >
>>>>>> > >
>>>>>> >
>>>>>> https://www.youracclaim.com/badges/b7e709c6-aa87-4b02-9faf-099038475e36/public_url
>>>>>> > > > > >
>>>>>> > > > >
>>>>>> > > >
>>>>>> > > >
>>>>>> > > > --
>>>>>> > > > Best regards,
>>>>>> > > > Maxim
>>>>>> > > >
>>>>>> > >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> > Best regards,
>>>>>> > Maxim
>>>>>> >
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Maxim
>>>>>
>>>>
>>>>
>>>> --
>>>> Best regards,
>>>> Maxim
>>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Maxim
>>>
>>
>
> --
> Best regards,
> Maxim
>

Re: OpenMeetings 80 and 140 users performance test results

Reply via email to