Re: Performance & Load Testing Results and next steps - Improving OpenMeetings sign in and room enter performance

[email protected] Mon, 01 Feb 2021 19:34:09 -0800

API call sequence is:

UserService>login
RoomService>getExternal
UserService>getRoomHash
>Browser load URL $myURL?secureHash=XYZ (which in turn triggers a LOT more
internal calls in OpenMeetings)
=> Crashes the OpenMeetings Tomcat with 50-60 users entering the same server
instance within 5min.
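
For anyone who wants to reproduce this, the sequence above could be driven by
a small harness along these lines. This is only a sketch: the four step
functions are placeholders I made up for illustration, not real OpenMeetings
client calls, and would need to be replaced with actual requests against the
SOAP/REST API and the room URL. run_load, run_user and slowest_per_step are
likewise hypothetical helper names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_user(user_id, steps, start_delay=0.0):
    """Run every step in order for one simulated user and time each call."""
    time.sleep(start_delay)
    context = {"user_id": user_id}
    timings = []
    for name, step in steps:
        started = time.perf_counter()
        try:
            step(context)
            ok = True
        except Exception:
            ok = False
        timings.append((name, time.perf_counter() - started, ok))
        if not ok:
            break  # later steps depend on earlier ones, so stop this user
    return timings


def run_load(n_users, ramp_seconds, steps):
    """Start n_users spread evenly over ramp_seconds; return per-user timings."""
    gap = ramp_seconds / n_users if n_users else 0.0
    with ThreadPoolExecutor(max_workers=max(n_users, 1)) as pool:
        futures = [pool.submit(run_user, i, steps, i * gap) for i in range(n_users)]
        return [f.result() for f in futures]


def slowest_per_step(results):
    """Return the longest observed duration (seconds) for each step name."""
    worst = {}
    for timings in results:
        for name, seconds, _ok in timings:
            worst[name] = max(worst.get(name, 0.0), seconds)
    return worst


# Placeholder steps (assumptions): swap in real calls before using this.
def login(ctx):
    ctx["sid"] = "sid-%d" % ctx["user_id"]


def get_external(ctx):
    ctx["room_id"] = 1


def get_room_hash(ctx):
    ctx["hash"] = "hash-for-" + ctx["sid"]


def load_url(ctx):
    ctx["url"] = "$myURL?secureHash=" + ctx["hash"]


STEPS = [
    ("UserService>login", login),
    ("RoomService>getExternal", get_external),
    ("UserService>getRoomHash", get_room_hash),
    ("Browser load URL", load_url),
]
```

With real steps plugged in, run_load(60, 300, STEPS) would roughly mimic 60
users entering within 5min, and slowest_per_step() on the result shows which
call in the sequence is slowing down first.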


You can also crash the server (or make it significantly slower) with JUST:
Browser load URL $myURL?secureHash=XYZ (assuming hash is pre-existing)
=> With ~100 users entering you can start to see the performance of the
OpenMeetings Tomcat instance degrade:
 - Video pods disappearing
 - Slow response times.
It would be interesting to find out how long it takes to crash in this
scenario, but I would guess at ~150 users.
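
One way to look for that point would be to step the user count up and stop at
the first level where response times go bad. A rough sketch, assuming a
measure(n) function (hypothetical, not shown here) that loads the room URL for
n users and returns their response times in seconds:

```python
def p95(latencies):
    """Return the 95th percentile (nearest-rank) of a non-empty list."""
    ordered = sorted(latencies)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def find_degradation_point(measure, user_counts, max_p95_seconds):
    """Return the first user count whose p95 latency exceeds the limit.

    measure(n) is a placeholder: it should run the test with n users and
    return the observed response times in seconds. An empty result is
    treated as degraded, since it means nothing answered at all.
    Returns None if every level stays within the limit.
    """
    for n in user_counts:
        latencies = measure(n)
        if not latencies or p95(latencies) > max_p95_seconds:
            return n
    return None
```

For example, find_degradation_point(measure, [50, 100, 150, 200], 10.0) would
report the first of those levels where the slowest 5% of requests take longer
than 10sec. The levels and the limit are just illustrative values.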

Thanks
Sebastian

Sebastian Wagner
Director Arrakeen Solutions, OM-Hosting.com
http://arrakeen-solutions.co.nz/
https://om-hosting.com - Cloud & Server Hosting for HTML5
Video-Conferencing OpenMeetings
<https://www.youracclaim.com/badges/da4e8828-743d-4968-af6f-49033f10d60a/public_url>
<https://www.youracclaim.com/badges/b7e709c6-aa87-4b02-9faf-099038475e36/public_url>


On Tue, 2 Feb 2021 at 15:55, Maxim Solodovnik <[email protected]> wrote:

> On Tue, 2 Feb 2021 at 07:23, [email protected] <[email protected]>
> wrote:
>
> > Hi,
> >
> > I have been conducting a few more performance and load tests with the
> > goal of increasing participants to 100++.
> >
> > The challenge is:
> > *If more than 50-60 users dynamically create a room hash (using the
> > SOAP/REST API) and use that hash to enter the conference room, CPU and
> > memory spike and the server crashes*
> >
>
> Can you share API call sequence?
> Maybe we can write JMeter scenario for this?
>
> A server crash is something bad.
> What is happening? Is it a JVM crash? Or is the system low on resources and
> the kernel kills the troublemaker?
>
>
> > *Test scenario observations:*
> >  - It does not matter whether those users try to enter the same room or
> > separate rooms. The above test scenario is a mix of 4x4 conference rooms
> > and 20x1 webinars
> >  - This can be reproduced reliably and repeatedly
> >  - The issue starts with API calls taking 10sec++ and getting slower
> > until the OpenMeetings Tomcat instance crashes
> >  - The issue also shows up -BEFORE- the server crashes: you can see
> > video pods not completing their initialisation in the conference room
> > itself, for example missing video pods or video pods without a webcam
> > stream. This is likely linked to slow-running API or web-socket calls
> > => I can deliver data samples or screenshots if required via our
> > Confluence space.
> >
> > *Hardware and software:*
> >  - The server and OpenMeetings instance are isolated on separate hardware
> > with 2GB of memory allocated
> >  - There is no spike on the KMS or database hardware/CPU/memory. The
> > spike is only in the OpenMeetings Tomcat server instance
> >
> > *Possible ways to mitigate without code changes:*
> >  - You can mitigate part of this issue if you spread the users entering
> > over a longer time period. However, it needs more than 10min of
> > separation for 50-60 participants to enter without issues
> >  - You can mitigate part of this issue if you, for example, create the
> > room hash in a different process (say 1h before use) and enter the
> > conference room only once all hashes are created. It still leads to
> > issues, but up to 100 users can enter within 5-10min if you just use the
> > links, rather than creating the link AND entering with it in the same
> > time/process
> >  - Increasing Tomcat to more than 2GB of memory per Tomcat instance may
> > help, not sure by how much though
> >
> >  => I think we should spend further time and propose ways to get rid of
> > those spikes. The mitigations are not realistic enough to use in
> > practice.
> >
> > *My proposal is:*
> > Further analysis is needed:
> >  - Capture all OpenMeetings calls that happen during room hash creation
> > and conference room entry
> >  - Measure the duration of every call during room hash creation and
> > conference room entry, and the CPU spikes or memory usage on a per-call
> > basis
> >  - Eventually get a stack trace or a profile that exports the current
> > in-memory objects, to review where and what creates those spikes
> >
> > Once a per-call analysis is there it should be a lot easier to pinpoint
> > specific issues and propose improvements.
> >
> > As with all performance optimisation, this is likely to need more
> > discussion once more detailed data is available.
> >
> > Thanks,
> > Sebastian
> >
> >
>
>
> --
> Best regards,
> Maxim
>
