Chris,

I have changed the connector config as below and it has improved
performance. I want this config to support at least 20k concurrent
requests. In my tests there is still a delay in the response, and I
traced it to Elasticsearch, so I am increasing the number of
Elasticsearch replicas. Could you please verify whether the connector
config below is good enough on its own, leaving the Elasticsearch
tuning aside?

    <Connector port="8080"
               protocol="org.apache.coyote.http11.Http11NioProtocol"
               connectionTimeout="1000"
               maxConnections="40000"
               maxThreads="40000"
               processorCache="2000"
               minSpareThreads="4000"
               maxKeepAliveRequests="4000"
               URIEncoding="UTF-8"
               redirectPort="8443" />
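For comparison, here is a minimal sketch of how the same NIO connector
is often sized for a high-connection, lower-concurrency workload. The
numbers are illustrative assumptions only, not values recommended
anywhere in this thread:

    <!-- Illustrative sizing only. With NIO, maxConnections bounds the
         sockets the poller will keep open (this can be large), while
         maxThreads bounds the requests actually executing at once and
         is normally far smaller; acceptCount is the OS backlog for
         connection attempts beyond maxConnections. -->
    <Connector port="8080"
               protocol="org.apache.coyote.http11.Http11NioProtocol"
               maxConnections="20000"
               acceptCount="1000"
               maxThreads="800"
               minSpareThreads="50"
               connectionTimeout="20000"
               keepAliveTimeout="5000"
               maxKeepAliveRequests="100"
               URIEncoding="UTF-8"
               redirectPort="8443" />

The underlying point, echoed in the replies below, is that holding 20k
open connections does not require 20k (or 40k) threads: idle keep-alive
connections are parked on the NIO poller, and maxThreads only has to
cover requests that are executing concurrently.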
On Thu, Nov 12, 2020 at 8:12 PM Christopher Schultz <
ch...@christopherschultz.net> wrote:

> Ayub,
>
> On 11/12/20 11:20, Ayub Khan wrote:
> > Chris,
> >
> > That's correct, it's just a plain static hello world page I created
> > to verify Tomcat, and it is served by Tomcat. I have bundled this
> > page in the same context where the service is running. When I create
> > load on the service and then try to access the static hello world
> > page, the browser keeps busy and does not return the page.
> >
> > I checked the database dashboard and the monitoring charts are
> > normal: no spikes on CPU or any other database resource. The delay
> > is noticeable when there are more than 1000 concurrent requests from
> > each of 4 different JMeter test instances.
>
> That's 4000 concurrent requests. Your <Connector> only has 2000
> threads, so only 2000 requests can be processed simultaneously.
>
> You have a keepalive timeout of 6 seconds (6000ms) and I'm guessing
> your load test doesn't actually use KeepAlive.
>
> > Why does Tomcat not even serve the html page?
>
> I think the keepalive timeout explains what you are seeing.
>
> Are you instructing JMeter to re-use connections and also use
> KeepAlive?
>
> What happens if you set the KeepAlive timeout to 1 second instead of
> 6? Does that improve things?
>
> -chris
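As a concrete version of the 1-second keep-alive experiment suggested
above: keepAliveTimeout is the standard connector attribute for this,
and when it is unset it inherits the connectionTimeout value, so
setting it explicitly lets you shorten keep-alive without also
shortening the read timeout. This is only a sketch; the 6000ms value
is carried over from the config being discussed:

    <!-- keepAliveTimeout: how long Tomcat waits for the next request
         on a kept-alive connection before closing it. When unset, it
         defaults to connectionTimeout. -->
    <Connector port="8080"
               protocol="org.apache.coyote.http11.Http11NioProtocol"
               connectionTimeout="6000"
               keepAliveTimeout="1000"
               URIEncoding="UTF-8"
               redirectPort="8443" />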
> > On Thu, Nov 12, 2020 at 7:01 PM Christopher Schultz <
> > ch...@christopherschultz.net> wrote:
> >
> >> Ayub,
> >>
> >> On 11/12/20 10:47, Ayub Khan wrote:
> >>> Chris,
> >>>
> >>> I am using HikariCP connection pooling and the maximum pool size
> >>> is set to 100, without specifying minimum idle connections. Even
> >>> during high load I see more than 80 connections in the idle state.
> >>>
> >>> I have set up debug statements to print the total time taken to
> >>> complete the request. The response time of a completed call under
> >>> load is around 5 seconds; the response time without load is around
> >>> 400 to 500 milliseconds.
> >>
> >> That's a significant difference. Is your database server showing
> >> high CPU usage or more I/O usage during those high-load times?
> >>
> >>> During the load I cannot even access the static html page.
> >>
> >> Now *that* is an interesting data point.
> >>
> >> You are sure that the "static" request doesn't hit any other
> >> resources? No filter is doing anything? No logging to an external
> >> service or double-checking any security constraints in the db
> >> before serving the page?
> >>
> >> (And the static page is being returned by Tomcat, not nginx,
> >> right?)
> >>
> >> -chris
> >>
> >>> On Thu, Nov 12, 2020 at 4:59 PM Christopher Schultz <
> >>> ch...@christopherschultz.net> wrote:
> >>>
> >>>> Ayub,
> >>>>
> >>>> On 11/11/20 16:16, Ayub Khan wrote:
> >>>>> I was load testing using the EC2 load balancer DNS. I have
> >>>>> increased the connector timeout to 6000 and also gave 32 GB to
> >>>>> the JVM of Tomcat. I am not seeing connection timeouts in the
> >>>>> nginx logs now, no errors in kernel.log, and no errors in
> >>>>> Tomcat's catalina.out.
> >>>>
> >>>> The timeouts are most likely related to the connection timeout
> >>>> (and therefore keepalive) setting. If you are proxying
> >>>> connections from nginx and they should be staying open, you
> >>>> should really never be experiencing a timeout between nginx and
> >>>> Tomcat.
> >>>>
> >>>>> During regular operations, when the request count is between 4k
> >>>>> and 6k requests per minute, the open files count for the Tomcat
> >>>>> process is between 200 and 350. Responses from Tomcat are within
> >>>>> 5 seconds.
> >>>>
> >>>> Good.
> >>>>
> >>>>> If the request count goes beyond 6.5k, open files slowly move up
> >>>>> to 2300 to 3000, and the request responses from Tomcat become
> >>>>> slow.
> >>>>
> >>>> This is pretty important, here. You are measuring two things:
> >>>>
> >>>> 1. Rise in file descriptor count
> >>>> 2. Application slowness
> >>>>
> >>>> You are assuming that #1 is causing #2. It's entirely possible
> >>>> that #2 is causing #1.
> >>>>
> >>>> The real question is "why is the application slowing down". Do
> >>>> you see CPU spikes? If not, check your db connections.
> >>>>
> >>>> If your db connection pool is fully utilized (no more available),
> >>>> then you may have lots of request-processing threads sitting
> >>>> there waiting on db connections. You'd see a rise in incoming
> >>>> connections (waiting) which aren't making any progress, and the
> >>>> application seems to "slow down", and there is a snowball effect
> >>>> where more requests mean more waiting, and therefore more
> >>>> slowness. This would manifest as slow response times without any
> >>>> CPU spike.
> >>>>
> >>>> You could also have a slow database and/or some other resource,
> >>>> such as a downstream web service.
> >>>>
> >>>> I would investigate those options before trying to prove that fds
> >>>> don't scale on the JVM or Linux (because they likely DO scale
> >>>> quite well).
> >>>>
> >>>>> I am not concerned about the high open files count, as I do not
> >>>>> see any errors related to open files. The only side effect of
> >>>>> open files going above 700 is that the response from Tomcat is
> >>>>> slow. I checked whether this is caused by Elasticsearch; AWS
> >>>>> CloudWatch shows the Elasticsearch response is within 5
> >>>>> milliseconds.
> >>>>>
> >>>>> What might be the reason that, when the open files count goes
> >>>>> beyond 600, the response time for Tomcat slows down? I tried
> >>>>> with Tomcat 9 and it's the same behavior.
> >>>>
> >>>> You might want to add some debug logging to your application when
> >>>> getting ready to contact e.g. a database or remote service.
> >>>> Something like:
> >>>>
> >>>> [timestamp] [thread-id] DEBUG Making call to X
> >>>> [timestamp] [thread-id] DEBUG Completed call to X
> >>>>
> >>>> or
> >>>>
> >>>> [timestamp] [thread-id] DEBUG Call to X took [duration]ms
> >>>>
> >>>> Then have a look at all those logs when the application slows
> >>>> down, and see if you can observe a significant jump in the time
> >>>> to complete those operations.
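A minimal Java sketch of the timing log described above, assuming
SLF4J is on the classpath; the CallTimer helper and the "orders-db"
label in the usage comment are hypothetical placeholders for whatever
downstream call is being wrapped:

    import java.util.concurrent.Callable;

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public final class CallTimer {
        private static final Logger log =
                LoggerFactory.getLogger(CallTimer.class);

        // Wraps a downstream call (database, Elasticsearch, web
        // service) with before/after DEBUG lines. The timestamp and
        // thread id come from the logging pattern, e.g.
        // "%d [%t] %-5p %m%n". The duration is logged even when the
        // call throws, thanks to the finally block.
        public static <T> T timed(String target, Callable<T> call)
                throws Exception {
            log.debug("Making call to {}", target);
            long start = System.nanoTime();
            try {
                return call.call();
            } finally {
                long elapsedMs = (System.nanoTime() - start) / 1_000_000;
                log.debug("Call to {} took {}ms", target, elapsedMs);
            }
        }
    }

    // Hypothetical call site:
    //   List<Order> orders =
    //       CallTimer.timed("orders-db", () -> dao.findOrders(userId));

Comparing these durations at idle (around 400-500ms end-to-end in the
figures above) and under load (around 5 seconds) should show which
dependency's times jump when the slowdown begins.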
> >>>>
> >>>> Hope that helps,
> >>>> -chris