Can anybody confirm this?

Joerg

On 11.12.2003 13:27, Tor-Einar Jarnbjo wrote:

Hi,

we are having a rather strange problem with a Cocoon-based web application.
It seems to be load dependent, and we think it is related to XMLSerializers not being handed back to the ResourceLimitingPool when processing requests.


We've added a few log outputs to the pool's get and put methods and found that, at some point, Cocoon seems to stop returning XMLSerializers to the pool (or at least does so only very rarely). As soon as this happens, the pool starts creating new instances of the XMLSerializer, which is obviously a very expensive operation that, among other things, includes reading files from the hard disk. This causes Tomcat's thread pool to drain and, in effect, the system to stop working altogether.
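To make the mechanism concrete, here is a minimal sketch of a pool that counts created and idle instances, roughly what the added log outputs would reveal. CountingPool and PoolLeakDemo are hypothetical names used only for illustration; this is not the Excalibur ResourceLimitingPool API, and a plain Object stands in for the serializer.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

/** Hypothetical stand-in for a size-limited pool; not the real ResourceLimitingPool. */
class CountingPool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final int max;
    private int created;

    CountingPool(Supplier<T> factory, int max) {
        this.factory = factory;
        this.max = max;
    }

    /** Hands out an idle instance, or creates a new one if none is idle. */
    synchronized T get() {
        T item = idle.pollFirst();
        if (item == null) {
            if (created >= max) {
                throw new IllegalStateException("pool exhausted");
            }
            item = factory.get(); // the expensive step in the real system
            created++;
        }
        return item;
    }

    /** Returns an instance to the pool so that a later get() can reuse it. */
    synchronized void put(T item) {
        idle.addFirst(item);
    }

    synchronized int created() { return created; }
    synchronized int idle() { return idle.size(); }
}

class PoolLeakDemo {
    /** Runs `requests` get() calls; returns {created, idle} afterwards. */
    static int[] simulate(int requests, boolean alwaysReturn) {
        CountingPool<Object> pool = new CountingPool<>(Object::new, 1024);
        for (int i = 0; i < requests; i++) {
            Object serializer = pool.get();
            if (alwaysReturn) {
                pool.put(serializer); // the well-behaved case
            }
            // otherwise the instance is never handed back (the suspected leak)
        }
        return new int[] { pool.created(), pool.idle() };
    }

    public static void main(String[] args) {
        int[] ok = simulate(100, true);
        int[] leak = simulate(100, false);
        System.out.println("returned: created=" + ok[0] + " idle=" + ok[1]);
        System.out.println("leaked:   created=" + leak[0] + " idle=" + leak[1]);
    }
}
```

When every instance is returned, a single instance serves all sequential requests; when none is returned, every request pays the creation cost and the pool grows toward its limit.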


One such "crash" could look like this (with the pool size set to 1024):


At 14:52:49, the pool has 249 instances created, of which 190 are waiting in the pool.

Within the next 30 seconds, only 15 serializers are returned to the pool, but 205 are requested, causing the pool to be empty at 14:53:20.


At 14:53:24 another three serializers are returned to the pool, but in the meantime 11 serializers have been requested, causing the pool to grow to a size of 257 instances.
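The figures up to this point can be cross-checked with a small replay. The sketch below assumes that only the net balance of returns and requests within each window matters (the exact interleaving is not given in the log excerpt); PoolWindow is a hypothetical helper, not part of Cocoon or Excalibur.

```java
/** Hypothetical helper that replays one logging window against pool counters. */
class PoolWindow {
    final int created; // total instances created so far
    final int idle;    // instances currently waiting in the pool

    PoolWindow(int created, int idle) {
        this.created = created;
        this.idle = idle;
    }

    /** Applies `returned` put() calls and `requested` get() calls, net of each other. */
    PoolWindow apply(int returned, int requested) {
        int available = idle + returned;
        int shortfall = Math.max(0, requested - available); // must be newly created
        int newIdle = Math.max(0, available - requested);
        return new PoolWindow(created + shortfall, newIdle);
    }

    public static void main(String[] args) {
        PoolWindow at145249 = new PoolWindow(249, 190);
        PoolWindow at145320 = at145249.apply(15, 205);
        PoolWindow at145324 = at145320.apply(3, 11);
        System.out.println("14:53:20 created=" + at145320.created + " idle=" + at145320.idle);
        System.out.println("14:53:24 created=" + at145324.created + " idle=" + at145324.idle);
    }
}
```

Under that assumption, 190 + 15 idle instances exactly cover the 205 requests, and the next window's shortfall of 8 takes the pool from 249 to 257 instances, which matches the logged numbers.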

The next serializers are not returned to the pool until 14:54:57 (90 seconds later). In the meantime, the pool has grown to 350 instances and Tomcat's thread pool is already drained, as the server is not able to serve the requests fast enough.


In the following three minutes until 14:57:10, 80 serializers are returned to the pool, almost 400 are requested, and the pool has grown to 650 instances.

In the next two minutes until 14:59:00, the "tide" changes and there are about 150 serializers returned, but only 50 requested, so that the pool stays the same size and now has about 100 serializers cached.


From then on, there are again almost no serializers returned to the pool, and it reaches a size of 1000 instances at 15:03:00. We restarted the server around 15:04:30, as it did not seem to recover from the situation.


Such episodes are now occurring about once an hour when the server is running under high load. We have a load-balancing system in front of two servers, but as soon as one server "hangs" and all the requests are redirected to the other server, that one also fails with the same symptoms in less than a minute.


To put my question short: Does someone have an idea why this is happening and what we could do to solve the problem?


Tor

