DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=36827>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=36827

           Summary: Need an option to sever socket connections between
                    mod_jk and ajp connector after request/response cycle.
           Product: Tomcat 5
           Version: 5.0.28
          Platform: All
        OS/Version: Linux
            Status: NEW
          Severity: major
          Priority: P2
         Component: Native:JK
        AssignedTo: tomcat-dev@jakarta.apache.org
        ReportedBy: [EMAIL PROTECTED]


Here’s the situation as it stands today and what can be done to solve it. I’ll 
try to keep this short.

Running configuration:

•       Running on Linux Red-Hat Ent 3
•       1 X F5 load balancer and hardware SSL box.
•       5 X Apache 1.3.33/mod_jk 1.2.14
•       6 X JBoss 4.0.0/Tomcat 5.0.28 using the AJP13 connector. 
•       Oracle 9i

Our production environment hosts a number of applications, each with different 
load and usage patterns. Our problem comes from the fact that it is difficult 
to find a web farm configuration that will satisfy every application. For 
reasons I will not explain here, we cannot have a dedicated web farm for each 
application.

This is what we think is happening in our production environment, based on 
tests run in UAT (User Acceptance Tests) and documentation for the Apache and 
Tomcat products. This is all pretty new to us, so if someone can provide hard 
facts, you are more than welcome.

1.      The 1.3 generation of Apache web servers will spawn a child process to 
handle an HTTP request. Only one HTTP request at a time can be processed by 
that child. 
2.      As the load increases on the web server, additional child processes 
will be spawned to concurrently serve the requests. There is a default limit 
to how many child processes can be forked. That limit defaults to 256 but has 
been changed in production to 16384. This is the MaxClients directive. It 
seems that production really needs the 16384 value instead of the 256 default. 
With 256, our web servers were rejecting connections and could not support the 
load generated by all of our clients.
3.      To prevent latency, Apache will keep up to 100 spare child 
processes alive. Spare means that they are not serving requests. Once reached, 
that number of spare servers does not seem to decrease. This matches what we 
see in our tests in UAT, where 201 threads remain active in Tomcat: 100 spare 
children connections * 2 web servers, plus the accept() thread. 
4.      If a request needs to be forwarded to Tomcat/JBoss (dynamic pages), 
the child process mod_jk module will instantiate a socket connection to the 
ajp13 connector in Tomcat. 
5.      Tomcat will accept the connection and create a thread to serve it. 
Connections will be accepted up to a concurrent maximum of 1200. This upper 
value has been set by us. 
6.      Tomcat will reject connections when the maximum is reached. JBoss 
4.0.0 has a known issue where the server will die when the maximum is reached. 
This has been fixed in 4.0.2. 
7.      A connection could potentially be recycled in mod_jk (recycle_timeout) 
if no activity occurs thru the connection. However, any requests to Tomcat 
from any user session-bound to that Tomcat instance could go thru the 
connection, thus keeping it active. Recycling does not seem to occur. We use a 
recycle_timeout value of 300.
8.      The fact that each production web server can potentially serve up to 
16384 concurrent requests makes it possible for a web server to open far more 
connections to Tomcat than Tomcat can accept, and bring it down. 
9.      Tomcat can then become overloaded with connections. If a valid HTTP 
request comes thru Apache and is routed to a child process that has not yet 
made a connection to Tomcat, the connection could be impossible if Tomcat has 
already accepted its 1200 limit. 
10.     In that case, mod_jk could potentially fail over to another Tomcat. 
The user would however lose his session.
11.     The recycle_timeout and cache_size options are of no use to us 
because too many web server children are created to serve the company load. 
Thus, many different routes can be taken by requests targeted to our 
application, keeping all the connections alive.
12.     We tried really small recycle_timeout values (e.g. 20) with no effect. 
A netstat reveals that connections remain ESTABLISHED. 
13.     The MaxRequestsPerChild setting is set to 0 in PROD. It means that 
Apache child processes never die unless the number of idle children exceeds 
the MaxSpareServers value. Thus, at least 100 connections per web server will 
always remain connected to Tomcat. A value > 0 would at least guarantee that a 
child process would eventually die, freeing Tomcat connections and releasing 
leaked memory back to the OS. 
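
The numbers in the list above can be checked with a rough model. This is only 
a sketch: it assumes, as described in points 3 and 4, that each Apache child 
holds at most one persistent mod_jk connection to a given Tomcat. The function 
names are made up for illustration and are not part of Apache or mod_jk.

```python
# Back-of-the-envelope model of AJP connections held against one Tomcat.
# Assumption: one persistent mod_jk connection per Apache child process.

def idle_floor(max_spare_servers: int, web_servers: int) -> int:
    """Tomcat threads still active when Apache is idle:
    one per spare child on every web server, plus the accept() thread."""
    return max_spare_servers * web_servers + 1


def worst_case(max_clients: int, web_servers: int) -> int:
    """Upper bound on connections if every child on every web server
    has opened a connection to the same Tomcat."""
    return max_clients * web_servers


if __name__ == "__main__":
    print(idle_floor(100, 2))            # UAT: 2 web servers -> 201
    print(worst_case(16384, 5))          # PROD: 5 web servers -> 81920
    print(worst_case(16384, 5) > 1200)   # exceeds the 1200 limit -> True
```

The first figure reproduces the 201 threads seen in UAT. The second is only a 
theoretical ceiling, but it shows how far MaxClients 16384 sits above the 1200 
connections Tomcat has been configured to accept.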

It’s hard to see a path out of this one.

One solution would be to reduce the MaxClients Apache config back to 256. This 
would mean that a single instance of Tomcat would not be hit by more than 256 
* 5 = 1280 (5 is the web farm size) connections. Our current JVM settings 
(heap + thread stack sizes) would allow us to do it. We would also need to 
bump our current 1200 limit a bit higher. However, this solution is not 
compatible with other applications which have really high loads.
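
As a quick check of the first option, the sketch below works out how far the 
current Tomcat limit falls short of 256 * 5, and which MaxClients value would 
fit under the limit as it stands. The helper names are hypothetical; only the 
figures 256, 5 and 1200 come from the description above.

```python
def peak_connections(max_clients: int, web_servers: int) -> int:
    """Most connections one Tomcat can receive from the whole web farm."""
    return max_clients * web_servers


def shortfall(max_clients: int, web_servers: int, tomcat_limit: int) -> int:
    """How many connections the Tomcat limit must grow by (0 if it fits)."""
    return max(0, peak_connections(max_clients, web_servers) - tomcat_limit)


def largest_safe_max_clients(web_servers: int, tomcat_limit: int) -> int:
    """Largest MaxClients per web server that stays within the Tomcat limit."""
    return tomcat_limit // web_servers


if __name__ == "__main__":
    print(peak_connections(256, 5))           # 1280
    print(shortfall(256, 5, 1200))            # 80
    print(largest_safe_max_clients(5, 1200))  # 240
```

So with MaxClients at 256 the Tomcat limit would need at least 80 more 
connections, or MaxClients would have to drop to 240 to fit under 1200.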

A second option would be to patch mod_jk so that connections are dropped as 
soon as the response has been received from Tomcat. Drawbacks include 
preventing us from upgrading to new releases (unless we re-apply the 
modifications), introducing the risk of breaking something in this add-on, 
concentrating knowledge in the head of the person making the changes, and 
introducing yet another component for the prod people to know and manage. The 
overhead of opening a connection for each request is probably quite small but 
would need to be validated.
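
One way to start validating that overhead is to time plain TCP connect/close 
cycles. The sketch below does this against a throwaway listener on the 
loopback interface. It is an assumption-laden stand-in: it does not speak 
AJP13, does not involve mod_jk or Tomcat, and loopback timings will be lower 
than timings between real hosts, so treat the result as a lower bound only.

```python
import socket
import threading
import time


def measure_connect_overhead(n: int = 200) -> float:
    """Average seconds to open and close one TCP connection on loopback."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    server.listen(128)
    port = server.getsockname()[1]

    def accept_loop() -> None:
        # Accept and immediately close n connections, like a server that
        # drops the socket after each request/response cycle.
        for _ in range(n):
            conn, _addr = server.accept()
            conn.close()

    acceptor = threading.Thread(target=accept_loop, daemon=True)
    acceptor.start()

    start = time.perf_counter()
    for _ in range(n):
        with socket.create_connection(("127.0.0.1", port)):
            pass                       # connect, then close right away
    elapsed = time.perf_counter() - start

    acceptor.join(timeout=5)
    server.close()
    return elapsed / n


if __name__ == "__main__":
    print(f"average connect+close: {measure_connect_overhead() * 1e6:.1f} us")
```

A real measurement would have to run between a web server and a Tomcat host 
under production-like load.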

Finally, having our own web farm would be another solution. However, this goes 
against the Production master plan of having only one web farm for production.

-- 
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]