Hi.

This kind of question is very difficult to answer reasonably for anyone who is not you and does not have direct access to your system to see what happens, when it happens. The general and reasonable answer is that you need to use some monitoring tools to find out exactly where the bottleneck is, and then, very carefully, start tuning your system one parameter at a time to try to improve the situation. The worst thing to do is to start changing multiple things at once without really knowing what they do, because then you will get very confused very quickly.

Ah, one more thing: the default parameters of both Apache httpd and Tomcat are chosen by people who know what they are doing, to cover a majority of reasonable cases. So changing these parameters without knowing exactly what each parameter does, and how different parameters interact with one another, is always risky.

(This being said, the mere fact that you are asking on this list before doing that is a bonus point for you.)

With all that in mind, a couple of notes below:

Hernán Marsili wrote:
> Hi,
>
> For the past 4 years we has been working with a 'stable' configuration in
> which we put APACHE in front of TOMCAT7 (previously Tomcat6) with mod_jk
> connector. We usually serve high traffic sites with about 7000 to 10.000
> concurrent users per box (8gb RAM / 4 vcpu) (50.000 active users total).


So, basically, your configuration is fine and has been running reliably and in a stable way for 4 years, including a Tomcat version change.
Good.  One more reason for only changing things carefully.

> We are OK with the performance, but sometimes we notice Tomcat stops
> responding normally while there are at least 2 full CPU left to be consumed
> (JAVA memory is fine).

You may want to indicate a bit more precisely what you mean by "sometimes" and by "stops responding normally".


> This is the configuration we use for the connector:
>
>  <Connector port="8009" protocol="AJP/1.3" address="127.0.0.1"
> emptySessionPath="true" redirectPort="8443" maxThreads="1024"
> minSpareThreads="32" enableLookups="false" request.registerRequests="false"
> />


Note #1: you say that you have up to 10,000 concurrent users.
Yet there are only 1024 threads in Tomcat.

"Users" is not necessarily equal to "requests", but let's assume for a moment that they are.

Basically, Tomcat uses one thread to process one request, from the time the request is received to the time the response to that request has been sent back. So maybe there are moments when your Tomcat runs out of available threads to process all the requests that come in?
If that is the case, what will happen is:
- The TCP/IP stack on the Tomcat host will accept the client connection
- but this connection will be put in a queue, waiting for a thread to become available (i.e. when a thread finishes the request that it is currently processing). If all 1024 threads are busy processing requests (or waiting for additional requests from the same client, because of the keep-alive timeout, see below), then to the client it will look as if Tomcat is "not responding normally".

> I have a couple of questions:
> 1) should we set a particular connector or let Tomcat7 decide? I understand
> using protocol="AJP/1.3" the auto-switch kicks in. But, for non-SSL high
> concurrency sites maybe is best to fixed on APR?


I cannot answer that and will wait for someone else more qualified to do that.


> 2) how many THREADS can we have? can we go beyond the 1024?

Yes, there is no limit other than the available memory and the general performance of the machine. It is also a very easy parameter to change, and one that does not have a lot of obscure side-effects.
Apart from everything else, I would suggest raising it to e.g. 4096 and seeing what happens.
(As someone else commented, however: if the problem is not really in Tomcat but in some back-end database server, then this will make things worse.)
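
As a sketch of that change, the connector from the original post would differ only in its maxThreads value. The attributes below are copied from that post (a few are left out for brevity), and 4096 is just a trial value to test with, not a recommendation for every setup:

```xml
<!-- Sketch only: the AJP connector from the original post with maxThreads raised.
     4096 is a trial value; watch memory use and back-end (e.g. database) load afterwards. -->
<Connector port="8009" protocol="AJP/1.3" address="127.0.0.1"
           redirectPort="8443"
           maxThreads="4096"
           minSpareThreads="32"
           enableLookups="false" />
```

Keep in mind that Apache httpd has its own concurrency limit in front of Tomcat, so the number of requests that can actually reach these threads also depends on how httpd is configured.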


> 3) is there any advantage on using processorCache?


I don't know.

> 4) We are not defining a CONNECTION TIMEOUT not a KEEP ALIVE. Any advice on
> this one? The average user session is 7 minutes.


I do not remember what the default value is for keep-alive. But this may also be something easy to adjust, with potentially big effects.
If it is currently anything larger than about 5 seconds, change it to 5 seconds maximum.

The keep-alive logic was introduced at a time when networks were slower, and when setting up a new TCP/IP connection between a client and a server for each new request was quite "expensive". So the idea was: after a first request by a client, let's leave the connection open, to see if this client has more requests to send over that same connection within just a few seconds. This way, we avoid closing the connection each time and having to re-create a connection for each request. (Example: an HTML page with a lot of <img> tags in it.)

Unfortunately, what happens in a simple configuration like yours is this:
- the client opens a connection and sends a request
- Tomcat allocates a thread to serve that connection and that request
- the Tomcat thread processes the request and sends back the answer
(typically, all of the above takes a few milliseconds)
- but then the same thread stays alive, waiting on that connection for the duration of the keep-alive timeout (maybe as much as 30 seconds!), just to see if the client has any more requests to send over that same connection.

If the client sends more requests immediately, that's fine: we have saved the overhead of closing and re-opening a connection (and that is the point of keep-alive). But if the client does not send any more requests, then for the whole keep-alive period you are blocking one thread, which does nothing except wait, and which cannot be re-used to serve a request from another client. So, with your 1024 threads, you could very well have several hundred at times which are doing nothing but waiting for their keep-alive timeout to expire.

There are other ways to tune this behaviour, but reducing the keep-alive timeout to a few seconds maximum is easy and can provide a quick improvement, and would not have any bad side-effects even if you determine later that the problem is somewhere else.
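
For the AJP connector from the original post, the attribute in question would be keepAliveTimeout, given in milliseconds. As far as I can tell from the Tomcat 7 AJP connector documentation, it defaults to the value of connectionTimeout, which itself defaults to -1 (no timeout) for AJP connectors; please verify that against the documentation for your exact version. A minimal sketch using the 5 seconds suggested above (other attributes copied from the original post, a few omitted for brevity):

```xml
<!-- Sketch only: limit how long a thread waits for the next request on an idle connection.
     keepAliveTimeout is in milliseconds; 5000 = the 5 seconds suggested in the text. -->
<Connector port="8009" protocol="AJP/1.3" address="127.0.0.1"
           redirectPort="8443"
           maxThreads="1024"
           minSpareThreads="32"
           enableLookups="false"
           keepAliveTimeout="5000" />
```

Note that with mod_jk in front, the connections arriving at this connector come from Apache httpd rather than directly from browsers, so the effect of this setting is best checked with monitoring before and after the change.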


Connection timeout is something else, and is basically a way to avoid certain forms of DoS attacks. I think that nowadays it can also be reduced to something like 5-10 seconds, but it is probably not an immediate reason for your Tomcat responsiveness problems.
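
On the connector, this corresponds to the connectionTimeout attribute, also in milliseconds. The sketch below uses 10 seconds, the upper end of the range mentioned above, purely as an illustration. The mod_jk documentation advises keeping its connection_pool_timeout setting (in seconds, in workers.properties) in sync with this value, so the two would normally be changed together:

```xml
<!-- Sketch only: connectionTimeout is in milliseconds; 10000 = 10 seconds.
     If mod_jk's connection_pool_timeout is used, it should match this value (expressed in seconds). -->
<Connector port="8009" protocol="AJP/1.3" address="127.0.0.1"
           redirectPort="8443"
           maxThreads="1024"
           minSpareThreads="32"
           enableLookups="false"
           connectionTimeout="10000" />
```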

