On 22 Dec 2005, at 18:16, Ralph Goers wrote:

We finally got some thread dumps from our production server. They show something very different from what we were seeing in testing. First, this happens under light load after running for days. To summarize, many threads are waiting for the ResourceLimitingPool and several are waiting for the class loader. This system hasn't had the pools tuned, so I'm not surprised about pool contention, but I don't believe that is the issue, because the thread holding the pool lock is itself simply waiting for the class loader. We took two traces and both were similar, but not identical: a different thread was holding the class loader lock in each. However, in both cases the thread holding the class loader lock had been called from Castor while creating the portal layout.

So far, we have been speculating that the cause is an issue with the NPTL threads on Enterprise Linux 3. However, I'm wondering if perhaps Castor is having problems and simply calling the class loader over and over.

I'd appreciate any ideas.

OK, as far as I can see from the dumps, you might have a problem with Catalina's class loader implementation locking up at 0x60b19148:

at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1255)

That seems odd, though... I thought that code was debugged pretty thoroughly, unless a secondary lock at 0x60cd9970 prevents the first one from being released...
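
To make that hypothesis concrete, here is a minimal Java sketch of such a lock chain. It is not taken from Catalina or Castor: the class `LockChainDemo`, the two lock objects, and the thread names are hypothetical stand-ins for the monitors at 0x60b19148 and 0x60cd9970. It uses lambdas, so it assumes a much newer JVM than the ones discussed in this thread. One thread takes the first lock and then blocks on the second; every other thread that needs the first lock then stalls behind it, which is roughly what a dump full of waiters on a single monitor would look like.

```java
import java.util.concurrent.CountDownLatch;

public class LockChainDemo {
    // Hypothetical stand-ins for the two monitors mentioned above.
    private static final Object classLoaderLock = new Object();
    private static final Object secondaryLock = new Object();

    /** Returns the observed states of the holder thread [0] and the waiter thread [1]. */
    public static Thread.State[] observeStates() throws InterruptedException {
        final CountDownLatch holderInside = new CountDownLatch(1);

        // Takes the first lock, then needs the second one before it can release the first.
        Thread holder = new Thread(() -> {
            synchronized (classLoaderLock) {
                holderInside.countDown();
                synchronized (secondaryLock) {
                    // work that needs both locks would go here
                }
            }
        }, "holder");

        // Only needs the first lock, but cannot get it while the holder is stuck.
        Thread waiter = new Thread(() -> {
            synchronized (classLoaderLock) {
                // e.g. a loadClass call
            }
        }, "waiter");

        Thread.State[] observed = new Thread.State[2];
        synchronized (secondaryLock) {          // the calling thread owns the second lock
            holder.start();
            holderInside.await();               // holder now owns classLoaderLock
            waiter.start();
            // Poll until both threads are parked on their monitors.
            while (holder.getState() != Thread.State.BLOCKED
                    || waiter.getState() != Thread.State.BLOCKED) {
                Thread.sleep(10);
            }
            observed[0] = holder.getState();
            observed[1] = waiter.getState();
        }                                       // releasing the second lock unblocks the chain
        holder.join();
        waiter.join();
        return observed;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread.State[] states = observeStates();
        System.out.println("holder: " + states[0] + ", waiter: " + states[1]);
    }
}
```

In a real dump the thing to look for is the same shape: the thread that owns the contended monitor, and what that thread itself is waiting on.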

Anyhow, in my experience NPTL doesn't cause any problems whatsoever under Linux, but that said, I'm running on Jetty 4 with BEA JRockit 1.4.2. What VM and what container are you actually using?

        Pier

