Hi all!

We run a fairly large web application that we're currently load testing, but 
we're seeing sporadic errors whose cause we can't find.

We run a load test scenario using the Proxysniffer load testing tool on a 
machine connected to the same switch as the server under load. The load test 
simulates 3100 users looping over 27 pages of varying complexity. Each loop 
takes 2175 seconds on average and the average response time per page is 0.16 
seconds. The test runs for about 5 hours. After a while, normally around 1 
hour but sometimes after little more than 30 minutes and sometimes longer, 
occasional errors appear. The errors always come in clusters, with a bunch at 
each occurrence. After each occurrence everything runs fine for a length of 
time until the next occurrence.

Proxysniffer reports all errors as "Network Connection aborted by Server" but 
when we look at each error in detail we can see that they don't all occur at 
the same stage in the request cycle. Some occur on "transmit http request", 
some on "open network connection", some on "wait for server response", but all 
within the same second.

On one of the tests we had a total of more than 3,000,000 requests and only 
14 errors, spread over 2 occasions during the 5 hour test.
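
To put those numbers in perspective, here is a rough back-of-the-envelope 
calculation. It is only a sketch: the request count is a lower bound ("more 
than 3,000,000"), the duration is taken as exactly 5 hours, and the function 
name summary is just something made up for this example.

```shell
#!/bin/sh
# Rough load figures derived from the numbers quoted in this post.
# Arguments: requests errors duration_s users pages loop_s
summary() {
    awk -v r="$1" -v e="$2" -v d="$3" -v u="$4" -v p="$5" -v l="$6" 'BEGIN {
        printf "requests/s: %.1f\n", r / d          # overall HTTP request rate
        printf "pages/s: %.1f\n", u * p / l         # page views per second
        printf "error rate: %.6f%%\n", 100 * e / r  # failed share of requests
    }'
}

# 3,000,000 requests, 14 errors, 5 hours, 3100 users, 27 pages, 2175 s per loop
summary 3000000 14 $((5 * 3600)) 3100 27 2175
```

So the errors are a tiny fraction of the total traffic, which is part of why 
they are so hard to track down.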

The problem is 100% reproducible with the current setup and the other setups 
we've tested, but the timing of the errors is somewhat random.

The application logs show nothing unusual. The access logs show nothing 
unusual. We've included the session ids in the Tomcat logs and the failing URLs 
don't show up in the access log at all for the given session id (cookies are 
shown in the error report).

During the test the machine is under some load, but I wouldn't call it heavy 
load. The application is quite database intensive so postgres works a lot 
harder than java/tomcat.

At first we used Apache 2.2 with mod_jk in front of Tomcat. The errors were 
more numerous then, and we got a bunch of errors in mod_jk.log stating that 
Apache could not connect to Tomcat. To pinpoint the problem we've now excluded 
Apache httpd and run only Tomcat with the NIO HTTP connector. We also tried 
the vanilla HTTP connector.

We've tried both the default garbage collector with default settings and 
the flags "-XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSIncrementalMode". 
There was no significant difference in timings or errors between the two 
settings.

We've been able to match some of the errors with full collections reported by 
the flags "-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps", but some 
errors occur when no full GC is taking place.
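
The matching against the GC log can be done roughly as sketched here. This 
assumes the GC output ends up in a file where each line starts with the JVM 
uptime in seconds (which is what -XX:+PrintGCTimeStamps prints) and that the 
error time has already been converted to seconds since JVM start. The function 
name full_gc_near and the file name gc.log are made up for this example.

```shell
#!/bin/sh
# Print every full collection logged within WINDOW seconds of a given uptime.
# Usage: full_gc_near LOGFILE UPTIME_S WINDOW_S
# Assumes lines of the form "3605.120: [Full GC ... secs]".
full_gc_near() {
    awk -v t="$2" -v w="$3" '/\[Full GC/ {
        ts = $1
        sub(/:$/, "", ts)          # strip trailing colon from the uptime stamp
        d = ts - t
        if (d < 0) d = -d          # absolute distance to the error time
        if (d <= w) print          # close enough: show the whole log line
    }' "$1"
}

# Example (hypothetical file name): full GCs within 5 s of uptime 3606 s
#   full_gc_near gc.log 3606 5
```

An empty result for an error's timestamp means no full collection was logged 
near it, which is the case we can't explain.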



I'm running out of ideas here... What am I missing? What am I doing wrong? What 
could I try?



The full JVM flags are:

# general options
JAVA_OPTS="-server -Dbuild.compiler.emacs=true"
# Memory limits (we've tried both higher and lower values here)
JAVA_OPTS="${JAVA_OPTS} -XX:MaxPermSize=192m -Xmx1800m -Xms1800m"
# GC logging
JAVA_OPTS="${JAVA_OPTS}  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
# GC engine (tried with excluding this and using the default values)
#JAVA_OPTS="${JAVA_OPTS}  -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSIncrementalMode"
# GC tuning (tried with excluding these as well)
#JAVA_OPTS="${JAVA_OPTS}  -Xmn2g -XX:ParallelGCThreads=8 -XX:SurvivorRatio=8 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=31"
# JVM options
JAVA_OPTS="${JAVA_OPTS} -Dfile.encoding=utf-8 -Djava.awt.headless=true"


Software involved:
FreeBSD 8.0-RELEASE-p2 with diablo-jdk1.6.0 (we also tried openjdk6). Tomcat 
6.0.26 (previously 6.0.20 with same problem). The application uses 
org.apache.commons.dbcp.BasicDataSource to connect to postgresql 8.4.2 on the 
same machine. Most of the application uses hibernate and ehcache to access 
the database, but some parts use vanilla jdbc and some older parts still use a 
homebrew connection pool. We use spring for transaction management and 
autowiring of some handler/service objects.

Hardware:
16 CPU cores (Intel(R) Xeon(R) X5550  @ 2.67GHz)
32 GB RAM


Thanks in advance,
Patrik Kudo

