Hi Shapira and Tomcat Users,
Apologies to you all for raising these problems in a single thread again. (They are all
linked and cannot be separated.)
The production issues should be fairly clear from the mails I have written in the past.
This time we have identified some possible areas where we think there is a mistake.
Before going into those, let me briefly recap the production architecture:
(Please don't comment on the versions right now; believe me, they are still worthwhile
and wonderful.)
jdk - 1.4.2
apache - 1.3.27
Tomcat - 3.3
ajp version 1.2
4 Solaris boxes - each box has 1 Apache and 3 Tomcats, making 4 Apaches and 12 Tomcats
in all. Please see the enclosed image.
Problem: On a given day of the week, there is a heavy load on 2 Tomcats, 4_3 and 1_1,
in that order. The load balancer setting provided in the worker.properties is:
For Apache 1 - 4_3, 4_2, ...., 1_2, 1_1 
For Apache 2 - 1_1, 1_2, ...., 4_2, 4_3 
For Apache 3 - 4_3, 4_2, ...., 1_2, 1_1 
For Apache 4 - 1_1, 1_2, ...., 4_2, 4_3 
So if, for example, there is high load on this day, Tomcat 4_3 will have 290 threads
in the JVM, with the ajp connector module setting specified as 250. The thread dumps
indicated 258 threads inside the JVM.
The next Tomcat in the list is 1_1, and then, if the load still persists, 4_2 and then
1_2 (round robin deaths).
If I suppose that mod_jk is doing load balancing effectively, then why is it that only
these specific Tomcat instances, in this specific order, are getting spiked? (Very
strange???) We have only been able to see this pattern after a month of issues and
analysis. (On other days all the Tomcats are happy and so are the clients.)
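One way to reason about this pattern is to compare two selection policies over the
list orders given above. The Python sketch below is only a toy model and not mod_jk's
actual algorithm: first_fit, round_robin, simulate, the 150 in-flight requests per
Apache, and the filled-in middle of each list (written as "...." above) are all
assumptions made for illustration. The cap of 250 is the ajp connector thread setting.

```python
from collections import Counter

# Worker names follow the mail: <box>_<instance>, 4 boxes x 3 Tomcats.
WORKERS = [f"{box}_{inst}" for box in range(1, 5) for inst in range(1, 4)]

# List order per Apache. ASSUMPTION: the elided middle of each list is the
# plain ascending or descending sequence.
ORDER = {
    "apache1": list(reversed(WORKERS)),  # 4_3, 4_2, ..., 1_2, 1_1
    "apache2": list(WORKERS),            # 1_1, 1_2, ..., 4_2, 4_3
    "apache3": list(reversed(WORKERS)),
    "apache4": list(WORKERS),
}


def first_fit(order, load, cap, state):
    """Hypothetical policy: always take the first worker in the list with room."""
    for worker in order:
        if load[worker] < cap:
            return worker
    return None


def round_robin(order, load, cap, state):
    """Hypothetical policy: rotate through the list, skipping full workers."""
    for _ in range(len(order)):
        worker = order[state["next"] % len(order)]
        state["next"] += 1
        if load[worker] < cap:
            return worker
    return None


def simulate(policy, requests_per_apache, cap):
    """Assign long-lived requests from all four Apaches and return load per Tomcat."""
    load = Counter()
    states = {name: {"next": 0} for name in ORDER}
    for _ in range(requests_per_apache):
        for name, order in ORDER.items():
            worker = policy(order, load, cap, states[name])
            if worker is not None:
                load[worker] += 1
    return load


if __name__ == "__main__":
    print("first fit  :", dict(simulate(first_fit, 150, 250)))
    print("round robin:", dict(simulate(round_robin, 150, 250)))
```

Under these assumptions the first-fit policy fills 4_3 and 1_1 to the cap and then
spills onto 4_2 and 1_2, which is the order seen in production, while the round-robin
policy leaves every Tomcat at the same load. That does not prove what mod_jk is doing
here, but it suggests the list order is worth checking against how the balancer picks
a worker.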
All Tomcats have an individual connection pool with the max and min connections set to
100 and 30. The number of ajp threads is 250 and the number of threads evident from the
thread dump is 290, so it seems there are excess threads and starvation only on the
spiked Tomcat, and connections become the bottleneck.
So, when the problem occurs, the observations are as follows:
Tomcat 1_1: 290 ]
Tomcat 1_2: 50  ]All Web - 1 making 400 connections (Dynamic Scheme!!! which is evil - Shapira)
Tomcat 1_3: 50  ]
Tomcat 2_1: 50
Tomcat 2_2: 50
Tomcat 2_3: 50
Tomcat 3_1: 50
Tomcat 3_2: 50
Tomcat 3_3: 50  
Tomcat 4_1: 50  ]
Tomcat 4_2: 50  ]All Web - 4 making 400 connections
Tomcat 4_3: 290 ]
Total DB connections: 600 (from 2 Tomcats) + 500 (under peak load) = 1100 connections.
(This makes the application cry until we kill these 2 Tomcats.) (Allowed connections:
350*4 = 1400.)
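The arithmetic behind these totals can be rechecked from the per-Tomcat figures in the
table. The short Python sketch below assumes the table values are connection counts;
the variable names are made up for illustration, and the pool maximum of 100 and the
350*4 allowance are the figures quoted in this mail.

```python
from collections import defaultdict

# Tomcat names as <box>_<instance>; counts copied from the observation table.
WORKERS = [f"{box}_{inst}" for box in range(1, 5) for inst in range(1, 4)]
SPIKED = {"1_1", "4_3"}
observed = {w: (290 if w in SPIKED else 50) for w in WORKERS}

# Sum per web box (the part of the name before the underscore).
per_box = defaultdict(int)
for worker, count in observed.items():
    per_box[worker.split("_")[0]] += count

spiked_total = sum(observed[w] for w in SPIKED)
others_total = sum(c for w, c in observed.items() if w not in SPIKED)
total = spiked_total + others_total

ALLOWED = 350 * 4   # allowance quoted in the mail
POOL_MAX = 100      # configured pool maximum per Tomcat

# How far each Tomcat sits above its configured pool maximum.
over_pool = {w: c - POOL_MAX for w, c in observed.items() if c > POOL_MAX}

if __name__ == "__main__":
    print("per box     :", dict(per_box))
    print("spiked total:", spiked_total)
    print("others total:", others_total)
    print("grand total :", total, "of", ALLOWED, "allowed")
    print("over pool   :", over_pool)
```

The exact sums are 580 + 500 = 1080, which is rounded above to 600 + 500 = 1100, and
390 per spiked web box, rounded above to 400. Each spiked Tomcat is 190 connections
over its configured pool maximum of 100.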
So, what happens to the site where the application is running, one might ask?
The Apache connections reach the max count (350) and Apache starts giving 404 errors
to the users, leaving us clueless about what is going wrong. Why isn't the load
redirected to the other Tomcats, which are idle? What is happening to the worker?
Overall, the things we thought we should do:
- Reduce the ajp connector threads per Tomcat from 250 to 175
- Remove the dynamic scheme and use the fixed_wait scheme (point brought up by Shapira
and, finally, by our think tank here)
- Increase the pool connections (yet to be debated???) (Support for an additional 75
threads is needed beyond the current pool limit, else these 75 will have to wait, but
then for how long??? Is it good??? The threading model of jdk 1.4.2 uses LWP synch)
- Change the load balancing scheme, as there is something wrong???
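
To make the "75 will have to wait" point concrete, here is a minimal Python sketch of
a pool with a fixed size and a fixed wait. It is not the pool implementation we run in
production: FixedWaitPool, its methods, and the 1 ms wait are hypothetical, and only
the numbers 175 (proposed ajp threads) and 100 (pool max) come from this mail. The
loop is single-threaded on purpose so the result is deterministic; it stands in for
175 request threads that all hold a connection at the same time.

```python
import threading


class FixedWaitPool:
    """Hypothetical pool: at most max_size connections, callers wait a fixed time."""

    def __init__(self, max_size, wait_seconds):
        self._slots = threading.BoundedSemaphore(max_size)
        self._wait = wait_seconds

    def acquire(self):
        # True if a connection slot was obtained within the wait, else False.
        return self._slots.acquire(timeout=self._wait)

    def release(self):
        # Hand a slot back so a waiting caller can proceed.
        self._slots.release()


# 175 requests each ask for a connection and none of them releases it.
pool = FixedWaitPool(max_size=100, wait_seconds=0.001)
results = [pool.acquire() for _ in range(175)]
granted = results.count(True)
timed_out = results.count(False)

if __name__ == "__main__":
    print("granted:", granted, "timed out:", timed_out)
```

With a fixed pool the extra requests fail or wait instead of opening more DB
connections, so how long they wait depends entirely on how quickly the first 100
release theirs.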

Last but not least, it seems there is only one worker thread per Apache to redirect
requests to all the Tomcats using mod_jk, so 4 worker threads in all. Are they
sufficient to balance the load under these circumstances? If the load is higher on 1_1
or 4_3, can there be a case where 1 or 2 of these 4 have died and the remaining ones
are all redirecting only to 4_3 (just a coincidence)? If so, that is an even worse
scenario. No one on my team has raised this issue, but it became evident today while
going through the architecture. Secondly, if we increase the number of worker threads,
would we see any improvement?
I need a good explanation of what the worker thread does. How does it balance the
load?
I know this may sound very strange, but any comments/suggestions from anyone on this
list would be appreciated.
Special thanks to Shapira for answering my previous questions; I hope he will
continue.
~Arnab
