Hi,

This is the first time I have posted to this or any other discussion group. Apologies if I am not following the correct protocol; hopefully someone will correct me if that is the case.

I have checked the archive but have been unable to find anything that answers my questions. Again, apologies if I've missed stuff.

First, some background. We have a web app that runs on a secure government intranet. The app is essentially a forms application that allows users to submit a variety of expenses and excess hours claims to a legacy mainframe payroll system. The app uses only JSPs, servlets, beans and straight JDBC calls to a MySQL database backend. The hardware is made up of two quad 1.8GHz Windows 2000 servers, each with 3GB of memory. The app runs in parallel on each server (i.e. sticky sessions, not distributed) using fairly crude but effective load balancing that splits users 50/50 between the servers. Each server also hosts a MySQL database: the live database on one and an archive database on the other. We realise that this is not an ideal configuration, but circumstances leave us no choice. This is a very heavily used application: it has over 120,000 registered users and regularly handles over 2,000 concurrent users.

The app currently runs very successfully on Tomcat 4.0.5 with the following configuration:

   minProcessors = 250
   maxProcessors = 750
   enableLookups = false
   acceptCount = 200
   connectionTimeout = -1
   application context reloadable = false

The Java heap size is set to a minimum of 512M and a max of 1024M.
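As a rough illustration of what the connector settings above allow, the sketch below adds up the connections one server can hold at once. It assumes that each in-flight request occupies one processor thread and that acceptCount is the length of the queue for connections waiting when every thread is busy. The class and method names are made up for the example, and concurrent users are not the same thing as concurrent requests, so this is only an upper-bound estimate.

```java
class ConnectorCapacity {

    // Connections a single connector can hold before refusing new ones:
    // busy request threads plus the waiting (accept) queue.
    // Assumption: one thread per in-flight request.
    static int maxInFlight(int maxProcessors, int acceptCount) {
        return maxProcessors + acceptCount;
    }

    public static void main(String[] args) {
        int perServer = maxInFlight(750, 200); // values from the configuration above
        int servers = 2;                       // users are split 50/50 across two servers

        System.out.println("Per server: " + perServer);
        System.out.println("Both servers: " + (perServer * servers));
    }
}
```

On these assumptions the configured limits sit far above 100 simultaneous requests, which is why a thread pool exhausted at that level of load looks like something other than the maxProcessors value itself.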

We have been considering upgrading Tomcat to a later version, and for our latest release we installed version 4.1.24 on our development and test machines. Unit and functional testing went well, but we were unable to carry out any load testing because we lack suitable hardware to set up a test environment that mirrors live. We therefore decided to deploy to live using 4.0.5 again. However, on carrying out final acceptance testing in the live environment we experienced several unexplained crashes. Things that had worked fine in test were suddenly failing in live. My initial assumption (at 11pm, with heavy pressure to get the app deployed ready for 4am the next morning) was that the problem must be in Tomcat 4.0.5, as everything had worked fine in test. I therefore decided to upgrade live to 4.1.24 using the same configuration as we had for 4.0.5, shown above. Following the upgrade, acceptance testing went fine, with everything working just as it had in test.

However, major problems showed up the next morning as soon as the load began to build. As soon as concurrent usage went over 100 we received the error "threadpool full, no more threads available - please increase maxProcessors". The app was very unstable, with some users able to log on but most receiving a "page unavailable" message. All sorts of new, unexplained errors showed up in the logs, most of them due to attributes not being found in the HttpSession where they should have been. Memory usage on the servers was much higher than it had been and performance degraded badly. The situation got so bad that we had to bring the application down.

We then decided to re-examine the problems we'd had the previous night. We discovered that the issues with 4.0.5 had actually been code errors that were caught by the 4.0.5 JSP compiler/engine but ignored by 4.1.24. For example:
1. A semicolon placed after a page import in a JSP: compilation failure in 4.0, no error in 4.1.
2. Forwarding to a URL for a JSP page that did not exist: ServletException in 4.0, no error in 4.1, which appeared to treat the non-existent page as null. As it happens this was the correct behaviour for our app, but it might not have been.
3. Doing an Integer.parseInt on a String value taken from the HTTP session: when the String is null we get a NumberFormatException in 4.0 but no error in 4.1, which appeared to treat null as zero. Again this was the correct behaviour for our app, but again it might not have been.
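To make the third item concrete: in standard Java, Integer.parseInt throws a NumberFormatException when passed null, so code that wants "missing means zero" has to say so explicitly rather than rely on what the container appears to do. The sketch below is one way to do that. It uses a plain Map as a stand-in for the HTTP session so that it runs without the servlet API; the helper name parseOrDefault and the attribute names are invented for the example and are not taken from our app.

```java
import java.util.HashMap;
import java.util.Map;

class SessionIntDemo {

    // Convert a session value to an int, returning the fallback when the
    // value is null or not a valid integer (hypothetical helper).
    static int parseOrDefault(Object value, int fallback) {
        if (value == null) {
            return fallback;
        }
        try {
            return Integer.parseInt(value.toString().trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Stand-in for HttpSession attributes; names are illustrative only.
        Map<String, Object> session = new HashMap<>();
        session.put("claimCount", "3");

        System.out.println(parseOrDefault(session.get("claimCount"), 0));
        System.out.println(parseOrDefault(session.get("notSet"), 0));

        // The unguarded call fails on null regardless of the fallback logic above.
        try {
            Integer.parseInt(null);
        } catch (NumberFormatException e) {
            System.out.println("parseInt(null) threw NumberFormatException");
        }
    }
}
```

With a guard like this the behaviour for a missing attribute is a decision made in the application code, so it should not change when the container does.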


After we fixed these problems we rolled the live servers back to 4.0.5, and within ten minutes the application was happily handling over 1,200 concurrent users with response times back to well below half a second.

Sorry if this has turned into something of an essay, but I have tried to include all information that might be relevant.

I would be very grateful for answers to the following questions:
1. Why was 4.0.5 able to detect these errors while 4.1.24 ignored them? Are they changes to the JSP / Servlet specs, or are they bugs?
2. Why were performance and stability so much worse with 4.1.24? Have we missed some additional configuration changes that are required, or is 4.1 just worse than 4.0?
3. Are these problems general to all versions of 4.1, or is 4.1.24 just a bad release?


The Apache Tomcat pages advise users to upgrade to version 5, but following our experiences trying to go to 4.1 we don't feel at all confident about trying to move to version 5.

Thanks very much for your time,

Mike.

