Hello All,

Setup
-----
We are investigating a site running a web application with the
following setup:
IBM AIX 5.2
Tomcat 5.5.9
Java 5 64-bit (IBM patch level pap64dev-20061003a, downloaded from
http://www-128.ibm.com/developerworks/java/jdk/aix/service.html)
The server has 16 GB of memory and 6 CPUs, and runs both Tomcat and the
Oracle/Progress DBMS components.
The Tomcat startup script allocates 2 GB of memory as both the minimum
and the maximum.

Symptoms
--------
When Tomcat is under heavy load running the web application (at around
180 users, although it varies), Tomcat on port 8080 freezes: no new users
can access the web application, nor can they access the Tomcat Manager
page.
During a freeze we can still telnet to servername:8080 from a Unix
prompt on the server, so the port appears to be listening and accepting
connections even though no requests are processed.
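To make the telnet observation repeatable, here is a minimal sketch of a probe that separates "the port accepts a TCP connection" from "the server actually answers a request". The class name PortProbe, the result names, the timeout value and the plain "GET /" request are illustrative assumptions, not part of our actual setup.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: distinguishes a listening-but-silent port from a responsive one.
public class PortProbe {

    public enum Result { CONNECT_FAILED, ACCEPTED_NO_RESPONSE, RESPONDED }

    public static Result probe(String host, int port, int timeoutMillis) {
        Socket socket = new Socket();
        try {
            try {
                // Step 1: TCP connect only (this is all that telnet proves).
                socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            } catch (IOException e) {
                return Result.CONNECT_FAILED; // refused, unreachable, or connect timeout
            }
            // Step 2: send a trivial request and wait for at least one byte back.
            socket.setSoTimeout(timeoutMillis);
            OutputStream out = socket.getOutputStream();
            String request = "GET / HTTP/1.0\r\nHost: " + host + "\r\n\r\n";
            out.write(request.getBytes("ISO-8859-1"));
            out.flush();
            InputStream in = socket.getInputStream();
            return in.read() >= 0 ? Result.RESPONDED : Result.ACCEPTED_NO_RESPONSE;
        } catch (IOException e) {
            // Includes SocketTimeoutException: connected, but nothing came back in time.
            return Result.ACCEPTED_NO_RESPONSE;
        } finally {
            try { socket.close(); } catch (IOException ignored) { }
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8080;
        System.out.println(host + ":" + port + " -> " + probe(host, port, 5000));
    }
}
```

Run from cron or a loop, a probe like this could record roughly when the server moves from RESPONDED to ACCEPTED_NO_RESPONSE, which may help line the freeze up with other system logs.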
Worst of all, when the hang occurs, the normal shutdown.sh script cannot
shut down the hung Tomcat process.
The kill command is also ineffective with any signal, including -9.
Therefore, when the hang occurs, the only way to restart Tomcat and the
associated webapp is to reboot the whole server.
While the Tomcat server is hung, all other processes seem to run correctly -
for example, our Oracle database and the Progress DBMS components such
as Webspeed and Appserver all continue to run with no problem.
The CPU is NOT fully utilized when Tomcat freezes up.
We have not been able to find any error message in the catalina.out log
from the time of a freeze.

Research Done So Far
--------------------
We searched the mailing list archive (http://marc.theaimsgroup.com/)
using the keywords "tomcat aix hangs". The most relevant entry we
found was the following:
http://marc.theaimsgroup.com/?t=11609794843r=1w=2
However, that thread involves much older components, listed below, and
did not seem like a good match for our setup:
Tomcat 3.3.1a
AIX 5.3
IBM JDK 1.3.1
Apache/1.3.28
The last reply in that thread was also inconclusive about the root
cause.
Other things found in Bugzilla which we initially thought were relevant
were:
http://issues.apache.org/bugzilla/show_bug.cgi?id=4
http://issues.apache.org/bugzilla/show_bug.cgi?id=34693
http://issues.apache.org/bugzilla/show_bug.cgi?id=32040
Again, on closer inspection we could not find a way to apply the
information in the above entries to our situation.
The one entry from which we were able to extract something applicable was
the following:
http://issues.apache.org/bugzilla/show_bug.cgi?id=31142
I tried a telnet task with ANT 1.6.2:
- AIX 5.1 it's OK
- AIX 5.2 it's not OK the task freeze. I must make a CTRL-C to stop it
You may want to try adding -Djava.net.preferIPv4Stack=true to the Ant
command line to see if that helps. I have seen some very long delays on
AIX 5.2 with the IBM 1.4 VM when it needs to resolve remote addresses.
We have now added this option to our Tomcat startup script and are
waiting to see whether the Tomcat server holds up.
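To confirm that the option actually reaches the JVM, and to get a rough feel for address resolution delays, a minimal sketch along the following lines could be run with the same JVM and options as Tomcat. The class name Ipv4StackCheck and its method names are made up for illustration. Note that it only reports the property value and times one lookup; as far as we understand, the networking code reads the property at startup, so setting it later at run time would not change the stack in use.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical check: reports the preferIPv4Stack flag and times one name lookup.
public class Ipv4StackCheck {

    // True only if the JVM was given java.net.preferIPv4Stack=true.
    static boolean prefersIpv4() {
        return Boolean.getBoolean("java.net.preferIPv4Stack");
    }

    // Elapsed milliseconds for a single lookup, whether or not it succeeds.
    static long timeLookupMillis(String host) {
        long start = System.nanoTime();
        try {
            InetAddress.getByName(host);
        } catch (UnknownHostException e) {
            // A failed lookup is still worth timing, so fall through.
        }
        return (System.nanoTime() - start) / 1000000L;
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println("java.net.preferIPv4Stack = " + prefersIpv4());
        System.out.println("lookup of " + host + " took "
                + timeLookupMillis(host) + " ms");
    }
}
```

For example: java -Djava.net.preferIPv4Stack=true Ipv4StackCheck servername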
We are also trying the 32-bit Java 5 JVM to run Tomcat 5.5.9 instead of
the 64-bit one, to see how things go. The system has been stable for
24 hours so far with this setting.

Questions
---------
Has anyone encountered similar issues in the past, especially with
Tomcat 5 on AIX 5.2 + Java 5?
Can any of you good folks suggest other settings to configure in order
to increase our chances of getting some error messages logged during a
freeze/hang?
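As an illustration of the kind of logging we are after, here is a minimal sketch that writes out the state and stack of every thread in the JVM, plus a count of monitor-deadlocked threads, using only standard Java 5 APIs. The class name ThreadDumper is hypothetical, and how it would be wired into the webapp (for example a timer that writes a dump to a file every minute) is left open. It can only show what the threads were doing up to the last dump before the freeze, and it assumes the JVM is still able to run the dumping thread.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.Map;

// Hypothetical diagnostic helper: builds a plain-text dump of all live threads.
public class ThreadDumper {

    public static String dump() {
        StringBuilder sb = new StringBuilder();
        // Snapshot of every live thread with its current stack trace.
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append('"')
              .append(" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        // Returns null when no threads are deadlocked on object monitors.
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] deadlocked = mx.findMonitorDeadlockedThreads();
        sb.append("monitor-deadlocked threads: ")
          .append(deadlocked == null ? 0 : deadlocked.length).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```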

Conclusion
----------
Thanks for reviewing the above, and for any suggestions you might have.
Best Regards,
Matthew