Sorry, but the access log just counts calls. It doesn't contain the call length, which is the interesting part. Also, as you mentioned, we have a mix of HTTP and WebSocket calls, and WebSocket calls don't show up in the access logs.
It's just not useful. You can find out some very basic metrics, but that's not good enough for diagnosing.

Thanks
Sebastian

Sebastian Wagner
Director Arrakeen Solutions, OM-Hosting.com
http://arrakeen-solutions.co.nz/
https://om-hosting.com - Cloud & Server Hosting for HTML5 Video-Conferencing OpenMeetings
<https://www.youracclaim.com/badges/da4e8828-743d-4968-af6f-49033f10d60a/public_url>
<https://www.youracclaim.com/badges/b7e709c6-aa87-4b02-9faf-099038475e36/public_url>

On Tue, 2 Feb 2021 at 23:16, Maxim Solodovnik <[email protected]> wrote:

> tomcat has accesslog valve
> it should be enabled by default
>
> On Tue, 2 Feb 2021 at 16:49, [email protected] <[email protected]> wrote:
>
> > I'm not sure. Like I say: I stagger room entry by 5-10min. So it's not really a DDoS surge of users. It's a steady but slow growth.
> >
> > You might be right, there could be such an issue somewhere. Just quite difficult to find without call statistics right now.
> >
> > Have we looked into enabling some basic performance logs?
> > If you just have all API/WebSocket invocations logged (configurable) for performance analytics, you can find those kind of issues very easy.
> > Capture them and sort by call length, call numbers, sort by top ten => And you get very quickly to a result.
> >
> > Some of those performance logging frameworks are very easy to enable. You can just annotate methods in Java code. And depending on log settings it will then print those statistics to the log file.
> > Even for example into a format that can be further ingested into Prometheus for performance monitoring and graphing of results. Or for example in case of Prometheus generate a HTTP endpoint that exposes the metrics for generating statistics.
> >
> > See:
> >
> > - https://github.com/prometheus/client_java
> > - https://github.com/prometheus/client_java/blob/master/simpleclient_spring_web/src/main/java/io/prometheus/client/spring/web/PrometheusTimeMethod.java
> > - https://prometheus.github.io/client_java/io/prometheus/client/spring/web/PrometheusTimeMethod.html
> >
> > There might be other alternatives to Prometheus. But it is the current tool most widely supported and it seems with a lot of SDKs, examples and support. If we would have such tools available now I think it would be quite easy to pinpoint the bottlenecks. Doesn't need any JProfiler or YourKit. Those are useful but the setup is a bit harder and you constantly end up enabling/disabling the profiling.
> >
> > Thanks,
> > Sebastian
> >
> > On Tue, 2 Feb 2021 at 22:31, Maxim Solodovnik <[email protected]> wrote:
> >
> > > Previous time I saw such many-users-same-time issues
> > > it was because of too many Ajax requests in room
> > > I have moved lots of them to WS messages and things get better
> > >
> > > Wicket Ajax requests blocks all pages, maybe further improvements are required
> > >
> > > On Tue, 2 Feb 2021 at 16:27, Maxim Solodovnik <[email protected]> wrote:
> > >
> > > > fair enough :)
> > > >
> > > > On Tue, 2 Feb 2021 at 16:26, [email protected] <[email protected]> wrote:
> > > >
> > > >> I think adding cores at some point will be good.
> > > >> But we need to get to some reasonable user numbers on a single core/reasonable memory.
> > > >> Once those numbers are good => Scale it up.
> > > >>
> > > >> I have a try with the threads and report back.
> > > >>
> > > >> Thanks,
> > > >> Seb
> > > >>
> > > >> On Tue, 2 Feb 2021 at 22:23, Maxim Solodovnik <[email protected]> wrote:
> > > >>
> > > >> > OK
> > > >> > no cores if it is expensive
> > > >> >
> > > >> > just thought multithreaded application can benefit from multiple cores :)
> > > >> >
> > > >> > On Tue, 2 Feb 2021 at 16:21, [email protected] <[email protected]> wrote:
> > > >> >
> > > >> > > I don't really want to add more cores. The docker container has exactly 1 core just for OpenMeetings. And 4GB memory.
> > > >> > >
> > > >> > > We can try with 2 cores. But the price tags on those improvements are getting into a range of not viable options. Except you improve the performance by a factor of 10.
> > > >> > >
> > > >> > > Thanks
> > > >> > > Seb
> > > >> > >
> > > >> > > On Tue, 2 Feb 2021 at 22:13, Maxim Solodovnik <[email protected]> wrote:
> > > >> > >
> > > >> > > > Maybe you can add one more core to OM
> > > >> > > > how many do you have right now?
> > > >> > > >
> > > >> > > > On Tue, 2 Feb 2021 at 16:11, [email protected] <[email protected]> wrote:
> > > >> > > >
> > > >> > > > > I will have a look with 300 and repeat it.
> > > >> > > > >
> > > >> > > > > BTW are you using dockerized OM? how are you passing `xmx` via CATALINA_OPTS?
> > > >> > > > > => I have a custom OpenMeetings docker container and I set those via CATALINA_OPTS that are passed into the OpenMeetings instance.
> > > >> > > > > I can see in the catalina.out logs that it reads the values in and uses them.
> > > >> > > > >
> > > >> > > > > Are you setting additional memory for docker?
> > > >> > > > > => The Docker container itself also has 4GB memory available.
> > > >> > > > >
> > > >> > > > > If you compare the graphs from the 2GB and 4GB test you can see that memory usage in % has dropped by exactly 50%.
> > > >> > > > > So it seems pretty convincing that those settings are all correctly applied.
> > > >> > > > >
> > > >> > > > > Thanks
> > > >> > > > > Seb
> > > >> > > > >
> > > >> > > > > On Tue, 2 Feb 2021 at 22:04, Maxim Solodovnik <[email protected]> wrote:
> > > >> > > > >
> > > >> > > > > > the default is 150
> > > >> > > > > > could you set to 300?
> > > >> > > > > > we will see if there will be improvement
> > > >> > > > > >
> > > >> > > > > > BTW are you using dockerized OM? how are you passing `xmx` via CATALINA_OPTS?
> > > >> > > > > > Are you setting additional memory for docker?
> > > >> > > > > >
> > > >> > > > > > On Tue, 2 Feb 2021 at 16:00, [email protected] <[email protected]> wrote:
> > > >> > > > > >
> > > >> > > > > > > I can try and re-run, how many would you recommend worth trying for this scenario?
> > > >> > > > > > >
> > > >> > > > > > > Thanks
> > > >> > > > > > > Seb
> > > >> > > > > > >
> > > >> > > > > > > On Tue, 2 Feb 2021 at 21:56, Maxim Solodovnik <[email protected]> wrote:
> > > >> > > > > > >
> > > >> > > > > > > > Have you tried to increase maxThreads for Tomcat?
> > > >> > > > > > > >
> > > >> > > > > > > > On Tue, 2 Feb 2021 at 15:26, [email protected] <[email protected]> wrote:
> > > >> > > > > > > >
> > > >> > > > > > > > > I doubled it to 4GB OpenMeetings and 4GB KMS. I updated the docker instance to run OpenMeetings with Xms=2GB and Xmx=4GB.
> > > >> > > > > > > > >
> > > >> > > > > > > > > And I did run exactly the same test again:
> > > >> > > > > > > > > - 50-60 users
> > > >> > > > > > > > > - staggered to enter in a time period around 5-10min
> > > >> > > > > > > > > - distributed into 10 conference rooms 4x4 and 2 webinars with 20 users each
> > > >> > > > > > > > > - each test run calls the API to login/createRoomHash and then loads the URL with the room (plus start webcam/audio stream in the conference rooms)
> > > >> > > > > > > > >
> > > >> > > > > > > > > The results look almost the same. There is hardly any improvement:
> > > >> > > > > > > > >
> > > >> > > > > > > > > - CPU still spikes to almost 100%, memory is not a problem
> > > >> > > > > > > > > - Empty video pods as well as video pods where webcam stream didn't start
> > > >> > > > > > > > >
> > > >> > > > > > > > > There isn't a crash, but that is mostly because I stagger it to enter the server over a 5-10min period. Which didn't crash the 2GB instance either.
> > > >> > > > > > > > >
> > > >> > > > > > > > > Comparison of the CPU graphs of both hardware configuration and test runs:
> > > >> > > > > > > > >
> > > >> > > > > > > > > https://cwiki.apache.org/confluence/display/OPENMEETINGS/Performance+Testing#PerformanceTesting-ClusterPerformancetestresult02-022021
> > > >> > > > > > > > >
> > > >> > > > > > > > > There is pretty much no improvement.
> > > >> > > > > > > > >
> > > >> > > > > > > > > There is some work on the application side needed. This does not look like getting better by throwing more hardware at it.
> > > >> > > > > > > > >
> > > >> > > > > > > > > It is really quite limiting to have no logs about any sort of performance indicators like call length to narrow down where the bottleneck is.
> > > >> > > > > > > > > You may find some very low hanging fruits in terms of optimisation if you can simply concentrate on the top ten calls and optimise those.
> > > >> > > > > > > > > Rather than looking at CPU and memory graphs.
> > > >> > > > > > > > >
> > > >> > > > > > > > > Thanks
> > > >> > > > > > > > > Sebastian
> > > >> > > > > > > > >
> > > >> > > > > > > > > On Tue, 2 Feb 2021 at 17:18, [email protected] <[email protected]> wrote:
> > > >> > > > > > > > >
> > > >> > > > > > > > > > Have we ever looked into which Java method would require the most resources/time during the process of entering the conference room?
> > > >> > > > > > > > > >
> > > >> > > > > > > > > > On Tue, 2 Feb 2021 at 16:48, Maxim Solodovnik <[email protected]> wrote:
> > > >> > > > > > > > > >
> > > >> > > > > > > > > >> While doing load testing I did the following:
> > > >> > > > > > > > > >>
> > > >> > > > > > > > > >> create JMeter test loading "semistatic" stateless error page with 300 simultaneous threads (I can share this test, it is very simple)
> > > >> > > > > > > > > >> CPU usage of OM process was near to 100%
> > > >> > > > > > > > > >> the situation is better if Tomcat has more threads (maxThreads parameter)
> > > >> > > > > > > > > >>
> > > >> > > > > > > > > >> I guess we need to check "The Ultimate Tomcat Performance Guide" :)))
> > > >> > > > > > > > > >>
> > > >> > > > > > > > > >> On Tue, 2 Feb 2021 at 10:41, [email protected] <[email protected]> wrote:
> > > >> > > > > > > > > >>
> > > >> > > > > > > > > >> > Also the spikes are on the CPU actually more than on the memory:
> > > >> > > > > > > > > >> >
> > > >> > > > > > > > > >> > https://cwiki.apache.org/confluence/display/OPENMEETINGS/Performance+Testing#PerformanceTesting-ClusterPerformancetestresult02-022021
> > > >> > > > > > > > > >> >
> > > >> > > > > > > > > >> > The spike is just 50-60 users.
> > > >> > > > > > > > > >> >
> > > >> > > > > > > > > >> > Why would CPU spike to almost 100% just for that amount of users?
> > > >> > > > > > > > > >> >
> > > >> > > > > > > > > >> > I can try with 4GB for OpenMeetings and repeat the test.
> > > >> > > > > > > > > >> >
> > > >> > > > > > > > > >> > Thanks
> > > >> > > > > > > > > >> > Seb
> > > >> > > > > > > > > >> >
> > > >> > > > > > > > > >> > On Tue, 2 Feb 2021 at 16:34, Maxim Solodovnik <[email protected]> wrote:
> > > >> > > > > > > > > >> >
> > > >> > > > > > > > > >> > > On Tue, 2 Feb 2021 at 10:30, [email protected] <[email protected]> wrote:
> > > >> > > > > > > > > >> > >
> > > >> > > > > > > > > >> > > > I think what you mean is you have OpenMeetings and MySQL and KMS on one instance with 4GB.
> > > >> > > > > > > > > >> > > >
> > > >> > > > > > > > > >> > > > But it's 2GB just for OpenMeetings.
> > > >> > > > > > > > > >> > >
> > > >> > > > > > > > > >> > > I mean
> > > >> > > > > > > > > >> > > 4GB just for OM (demo-next)
> > > >> > > > > > > > > >> > > 8GB just for OM (demo-prod)
> > > >> > > > > > > > > >> > > and this might need to be increased in case of many users
> > > >> > > > > > > > > >> > >
> > > >> > > > > > > > > >> > > Additionally Tomcat's maxThreads might need to be increased here:
> > > >> > > > > > > > > >> > >
> > > >> > > > > > > > > >> > > https://github.com/apache/openmeetings/blob/master/openmeetings-server/src/main/assembly/conf/server.xml#L74
> > > >> > > > > > > > > >> > >
> > > >> > > > > > > > > >> > > I suspect lots of simultaneous users need more resources
> > > >> > > > > > > > > >> > >
> > > >> > > > > > > > > >> > > > KMS is separated with another 2GB
> > > >> > > > > > > > > >> > > > MySQL is on another server with another 2GB
> > > >> > > > > > > > > >> > > > So that would be 6GB in total. But only 2 are allocated to OpenMeetings.
> > > >> > > > > > > > > >> > > >
> > > >> > > > > > > > > >> > > > Xmx=2GB for OpenMeetings should be enough and not crash with 50-60 users entering the room at the same time.
> > > >> > > > > > > > > >> > > >
> > > >> > > > > > > > > >> > > > Thanks
> > > >> > > > > > > > > >> > > > Sebastian
> > > >> > > > > > > > > >> > > >
> > > >> > > > > > > > > >> > > > On Tue, 2 Feb 2021 at 16:26, Maxim Solodovnik <[email protected]> wrote:
> > > >> > > > > > > > > >> > > >
> > > >> > > > > > > > > >> > > > > Hello Sebastian,
> > > >> > > > > > > > > >> > > > >
> > > >> > > > > > > > > >> > > > > It seems 2GB of RAM is not enough for OM
> > > >> > > > > > > > > >> > > > > `OutOfMemoryError: Container killed due to memory usage`
> > > >> > > > > > > > > >> > > > > I never use less than 4GB (8-16GB in production)
> > > >> > > > > > > > > >> > > > >
> > > >> > > > > > > > > >> > > > > On Tue, 2 Feb 2021 at 09:54, Maxim Solodovnik <[email protected]> wrote:
> > > >> > > > > > > > > >> > > > >
> > > >> > > > > > > > > >> > > > > > On Tue, 2 Feb 2021 at 07:23, [email protected] <[email protected]> wrote:
> > > >> > > > > > > > > >> > > > > >
> > > >> > > > > > > > > >> > > > > >> Hi,
> > > >> > > > > > > > > >> > > > > >>
> > > >> > > > > > > > > >> > > > > >> I have been conducting a few more performance and load tests with the goal of increasing participants to 100++.
> > > >> > > > > > > > > >> > > > > >>
> > > >> > > > > > > > > >> > > > > >> The challenge is:
> > > >> > > > > > > > > >> > > > > >> *If more than 50-60 users dynamically create a room Hash (using Soap/Rest API) and use that Hash to enter the conference room CPU and memory spikes and server crashes*
> > > >> > > > > > > > > >> > > > > >
> > > >> > > > > > > > > >> > > > > > Can you share API call sequence?
> > > >> > > > > > > > > >> > > > > > Maybe we can write JMeter scenario for this?
> > > >> > > > > > > > > >> > > > > >
> > > >> > > > > > > > > >> > > > > > server crash is something bad
> > > >> > > > > > > > > >> > > > > > What is happening? Is it a JVM crash? Or is the system low on resources and the kernel kills the trouble-maker?
> > > >> > > > > > > > > >> > > > > >
> > > >> > > > > > > > > >> > > > > >> *Test scenario observations:*
> > > >> > > > > > > > > >> > > > > >> - It does not matter if those users try to enter the same room or separate rooms. In the above test scenario it's a mix of 4x4 conference rooms and 20x1 webinars
> > > >> > > > > > > > > >> > > > > >> - This can be reproduced reliably and repeatedly
> > > >> > > > > > > > > >> > > > > >> - The issue starts with API calls taking 10sec++ and getting slower.
> > > >> > > > > > > > > >> > > > > >>   Until the OpenMeetings Tomcat instance crashes
> > > >> > > > > > > > > >> > > > > >> - The issue also manifests that -BEFORE- the server crashes you can see video pods not completing the initialisation in the conference room itself.
> > > >> > > > > > > > > >> > > > > >>   For example missing video pods or video pods without a webcam stream.
> > > >> > > > > > > > > >> > > > > >>   Likely to be linked to slow running API or web-socket calls
> > > >> > > > > > > > > >> > > > > >> => I can deliver data samples or screenshots if required via our confluence space.
> > > >> > > > > > > > > >> > > > > >>
> > > >> > > > > > > > > >> > > > > >> *Hardware and software:*
> > > >> > > > > > > > > >> > > > > >> - Server and OpenMeetings Instance is isolated on a separated hardware and has 2GB of memory allocated
> > > >> > > > > > > > > >> > > > > >> - There is no spike on KMS or Database hardware/CPU/memory.
> > > >> > > > > > > > > >> > > > > >>   The spike is only in the OpenMeetings Tomcat Server instance
> > > >> > > > > > > > > >> > > > > >>
> > > >> > > > > > > > > >> > > > > >> *Possible ways to mitigate without code changes:*
> > > >> > > > > > > > > >> > > > > >> - You can mitigate part of this issue if you spread the users to enter over a longer time period. However it needs more than 10min separation to enter without issues for 50-60 participants
> > > >> > > > > > > > > >> > > > > >> - You can mitigate part of this issue if you for example create the room-hash in a different process (like 1h before using) and once all hashes are created you enter the conference room.
> > > >> > > > > > > > > >> > > > > >>   It still leads to issues, but you can enter up to 100 users within 5-10min, if you just use the links, rather than creating the link AND entering with the link at the same time/process
> > > >> > > > > > > > > >> > > > > >> - Increasing Tomcat to more than 2GB of memory per Tomcat instance may help, not sure by how much though
> > > >> > > > > > > > > >> > > > > >>
> > > >> > > > > > > > > >> > > > > >> => I think we should spend further time and propose ways to get rid of those spikes. The mitigations are not realistic enough to really be usable in practice.
>
> *My proposal is:*
> There is further analysis needed:
> - Capture all OpenMeetings calls that happen during the create room hash
>   and conference room-enter
> - Measure call lengths and any calls during the create room hash and
>   conference room-enter, and specific CPU spikes or memory usage on a
>   per-call basis
> - Eventually get a stack trace or have a profile available that exports
>   the current in-memory objects to review where and what creates those
>   spikes
>
> Once a per-call analysis is there, it should be a lot easier to pinpoint
> specific issues and propose improvements.
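
To make the per-call analysis in the proposal above a bit more concrete, here is a minimal sketch of what "capture calls, measure call length, sort by top ten" could look like in plain Java. It is only an illustration under assumptions: the class `CallStats`, its methods, and the call names `createRoomHash` and `enterRoom` are hypothetical and do not exist in OpenMeetings, and it uses only the JDK rather than Prometheus or any other metrics library.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Hypothetical helper (not part of OpenMeetings): call count and total duration per call name. */
public class CallStats {
    /** Accumulated numbers for one call name. */
    public static final class Stat {
        private final String name;
        private final LongAdder count = new LongAdder();
        private final LongAdder totalNanos = new LongAdder();

        Stat(String name) { this.name = name; }

        public String name() { return name; }
        public long count() { return count.sum(); }
        public long totalNanos() { return totalNanos.sum(); }
    }

    // Thread-safe, because HTTP and WebSocket calls arrive on many threads.
    private final Map<String, Stat> stats = new ConcurrentHashMap<>();

    /** Record one finished call whose duration is already known. */
    public void record(String name, long nanos) {
        Stat s = stats.computeIfAbsent(name, Stat::new);
        s.count.increment();
        s.totalNanos.add(nanos);
    }

    /** Run the call, measure its wall-clock time, and record it even if it throws. */
    public <T> T time(String name, Supplier<T> call) {
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            record(name, System.nanoTime() - start);
        }
    }

    /** The n call names with the largest total duration, slowest first. */
    public List<Stat> top(int n) {
        List<Stat> all = new ArrayList<>(stats.values());
        Comparator<Stat> byTotal = Comparator.comparingLong(Stat::totalNanos);
        all.sort(byTotal.reversed());
        return all.subList(0, Math.min(n, all.size()));
    }

    public static void main(String[] args) {
        CallStats stats = new CallStats();
        // Made-up call names and durations, for illustration only.
        stats.record("createRoomHash", 40_000_000L);
        stats.record("enterRoom", 900_000_000L);
        stats.record("enterRoom", 700_000_000L);
        for (Stat s : stats.top(10)) {
            System.out.printf("%s calls=%d totalMs=%d%n",
                    s.name(), s.count(), s.totalNanos() / 1_000_000);
        }
    }
}
```

In a real setup the `time(...)` wrapper would sit around the API and WebSocket handlers, and the numbers would more likely go to a log file or a metrics endpoint than to stdout. The sketch measures wall-clock time only; the per-call CPU and memory figures mentioned in the proposal would need a profiler or JVM-level instrumentation.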
>
> As with all performance optimisation, this is likely to need more
> discussion once more detailed data is available.
>
> Thanks,
> Sebastian
