Hi Daryl,

I've got your dump by mail. It is quite huge. JRockit writes the essential
things at the bottom of the dump. So the best is to start reading there.
What you can see that there are 162 threads being locked - so nearlly as
many request are not responding or responding very slow. You do not have a
JVM deadlock situation, otherwise JRockit would recoginze and write that at
the bottom of the dump. Now you can only see "Blocked lock chains" and "Open
lock chains".

Basically you see in your dump that there are only three threads, who are
holding the locks (Ajp13Processor[8009][204] , Ajp13Processor[8009][137] and
Ajp13Processor[8009][173]) All other threads are waiting for the locks to be
released. So now you can go up there and see the stack trace of these three
threads and you try to figure out what is happening.

From the chains at the bottom you can also see that the biggest blocker for
whom most of the threads are waiting is Ajp13Processor[8009][204]. In the
thread dump you see this:

"Ajp13Processor[8009][204]" id=661 idx=0x278 tid=23430 prio=5 alive, *in
native, native_blocked, daemon
at jrockit/vm/Allocator.nativeGetNewTLA()I(Native Method)
at jrockit/vm/Allocator.getNewTLAAndAlloc(IIIZ)Ljava/lang/Object;(Unknown
Source)[inlined]
*at
jrockit/vm/Allocator.getMoreMemoryAndAlloc(IIIIZ)Ljava/lang/Object;(Unknown
Source)[optimized]
at java/util/Hashtable.getEnumeration(I)Ljava/util/Enumeration;(Unknown
Source)[inlined]
*at java/util/Hashtable.elements()Ljava/util/Enumeration;(Unknown
Source)[inlined]
at org/jahia/services/usermanager/JahiaGroup.isMember(JahiaGroup.java
:262)[optimized]
^-- Holding lock: java/util/[EMAIL PROTECTED] lock]
*at org/jahia/services/acl/JahiaACL.getUserPermissionInGroupHashtable(
JahiaACL.java:953)[inlined]

JRockit tries to allocate memory from the heap and is natively blocked,
because the JVM still tries to get some memory by garbage collecting. This
situation is slowly leading to an OutOfMemory exception, but you made the
dump before that.

About 100 threads are directly waiting for the lock created by the
Hashtable.elements() in JahiaGroup.isMember(), which is a synchroinzed
method.

In fact ALL threads are waiting for that, because also the ones who say
-- Blocked trying to get lock:
org/jahia/services/fields/[EMAIL PROTECTED] lock]

do indirectly also wait for the lock in JahiaGroup.isMember(), because the
lock in JahiaFieldBaseService is being held by thread
Ajp13Processor[8009][137] and that thread itself also waits for the ONE
Hastable.elements() to return, which does not get memory:

"Ajp13Processor[8009][137]" id=594 idx=0x1f2 tid=23264 prio=5 alive, in
native, blocked, daemon
*-- Blocked trying to get lock: java/util/[EMAIL PROTECTED] lock]
*at jrockit/vm/Threads.waitForSignal(J)Z(Native Method)
at
jrockit/vm/Locks.fatLockBlockOrSpin(ILjrockit/vm/ObjectMonitor;II)V(Unknown
Source)[optimized]
at
jrockit/vm/Locks.lockFat(Ljava/lang/Object;ILjrockit/vm/ObjectMonitor;Z)Ljava/lang/Object;(Unknown
Source)[optimized]
at
jrockit/vm/Locks.monitorEnterSecondStage(Ljava/lang/Object;I)Ljava/lang/Object;(Unknown
Source)[optimized]
*at java/util/Hashtable.elements()Ljava/util/Enumeration;(Unknown
Source)[inlined]
at org/jahia/services/usermanager/JahiaGroup.isMember(JahiaGroup.java
:262)[optimized]
*at org/jahia/services/acl/JahiaACL.getPermission(JahiaACL.java
:718)[inlined]
at org/jahia/services/acl/JahiaACL.getPermission(JahiaACL.java
:778)[optimized]
at org/jahia/services/acl/JahiaAbstractACL.getPermission(
JahiaAbstractACL.java:591)[inlined]
at org/jahia/data/fields/JahiaField.checkAccess(JahiaField.java
:378)[inlined]
at org/jahia/data/fields/JahiaField.checkReadAccess(JahiaField.java
:324)[inlined]
*at org/jahia/services/fields/JahiaFieldBaseService.loadField(
JahiaFieldBaseService.java:777)[inlined]
at org/jahia/services/containers/ContainerFactory.fullyLoadContainer(
ContainerFactory.java:148)[optimized]
^-- Holding lock:
org/jahia/services/fields/[EMAIL PROTECTED]
lock]
*at org/jahia/services/containers/ContainerFactoryProxy.load(
ContainerFactoryProxy.java:129)[inlined]
And then there is the lock in Ajp13Processor[8009][173]:

"Ajp13Processor[8009][173]" id=630 idx=0x23a tid=23360 prio=5 alive, in
native, blocked, daemon
*-- Blocked trying to get lock:
org/jahia/services/fields/[EMAIL PROTECTED] lock]
*at jrockit/vm/Threads.waitForSignal(J)Z(Native Method)
at
jrockit/vm/Locks.fatLockBlockOrSpin(ILjrockit/vm/ObjectMonitor;II)V(Unknown
Source)[optimized]
at
jrockit/vm/Locks.lockFat(Ljava/lang/Object;ILjrockit/vm/ObjectMonitor;Z)Ljava/lang/Object;(Unknown
Source)[optimized]
at
jrockit/vm/Locks.monitorEnterSecondStage(Ljava/lang/Object;I)Ljava/lang/Object;(Unknown
Source)[optimized]
at
jrockit/vm/Locks.monitorEnter(Ljava/lang/Object;)Ljava/lang/Object;(Unknown
Source)[optimized]
*at org/jahia/services/fields/JahiaFieldBaseService.saveFieldDefinition(
JahiaFieldBaseService.java:1478)
at org/jahia/registries/JahiaFieldDefinitionsRegistry.setDefinition(
JahiaFieldDefinitionsRegistry.java:156)
^-- Holding lock:
org/jahia/registries/[EMAIL PROTECTED] lock]*
at org/jahia/data/containers/JahiaContainerSet.declareField(
JahiaContainerSet.java:280)[optimized]
at org/jahia/taglibs/declarations/DeclareFieldTag.doStartTag(
DeclareFieldTag.java:114)[optimized]
In this stacktrace you can see that this thread is holding the lock for the
synchronized method in JahiaFieldDeinitionsRegistry.setDefinition, but the
thread itself waits for the JahiaFieldBaseService lock, which is being held
in thread 137, which waits for the lock in thread 204.

In the forthcoming Jahia 4.2 we already have multiple refactorings to
prevent these situations. We have removed many synchronized methods in order
to not overlock the application, when it is not necessary. We are also
releasing database connections for the waiting threads, in order to not
reserve too many connections, for threads, who are waiting anyway. Thus we
will also release memory. The cases in your thread dump are already known
and covered. If you have more thread dumps, please send them again, just to
go sure that there is no situation, which we have not seen until now.

What you could do so far:
If this customer has not been updated recently with a new release of your
templates, then the call to *JahiaFieldBaseService.saveFieldDefinition *should
not really happen too often. Please put a breakpoint in there on debugging
or set the logger for *JahiaFieldDefinitionsRegistry *to debug level. If the
field definition is being saved too often, then there is something wrong
with the way, how you definie the fields. Fixing this, will lead to lesser
locks.

You could also - until Jahia 4.2 - try limiting the amount of
requests Apache passes to Tomcat, and return a server busy error code or
message. This way Tomcat will not be bombed with requests and not run out of
memory as easy, as for now all those threads are at least opening a database
connection and not closing it before waiting for the lock.  In Jahia 4.2 most
of those threads will release the connections.

Hope I could help you,
Benjamin

2007/2/21, Daryl Gerlach <[EMAIL PROTECTED]>:

I found the output it's in the catalina.out. How do I make sense of the
output?

I see several entries like the following:

-- Blocked trying to get lock:
org/jahia/services/fields/[EMAIL PROTECTED] lock]

Where can I forward the zip file of the thread dump contents?

Daryl

Daryl Gerlach wrote:

Benjamin,

I tried the Kill -QUIT <pid> But it didn't work, process kept running with
no output. I'm running Red Hat ES4. Confused

Daryl

Benjamin Papez wrote:

Hello Daryl,

we had similar problems, because Jahia 4.1 was running out of database
connections. There were also some deadlock issues. Most of these problems
will be fixed with the forthcoming Jahia 4.2 release, which will be out
very soon. Meanwhile you could try to reserve enough DB connections. What
you could also do is to create a thread dump ( kill -QUIT *process_id*) of
the tomcat java process, when such a situation occurs. This will help
determining, what the threads are doing and why they are not finishing their
work.

Greetings,
Benjamin

2007/2/6, Daryl Gerlach <[EMAIL PROTECTED]>:
>
> We are having issues running Jahia 4.1.1 in production. We recently
> upgraded our servers to Red Hat ES 4 and added apache 2.0.52 with mod_jk
> 1.2.20 to front  end the whole thing. Under load the server stops
> responding to requests. Here are the settings for workers.properties,
> server.xml, and httpd.conf
>
> worker.properties
> ============
> worker.list=consumer,builder
>
> # Set properties for consumer (ajp13)
> worker.consumer.type=ajp13
> worker.consumer.host=localhost
> worker.consumer.port=8019
>
> worker.builder.type=ajp13
> worker.builder.host=localhost
> worker.builder.port=8009
>
>
> httpd.conf
> =======
> <IfModule mod_jk.c>
>    JkWorkersFile conf/workers.properties
>    JkShmFile    logs/mod_jk.shm
>    JkLogFile    logs/mod_jk.log
>    JkLogLevel   info
>    JkLogStampFormat "[%a %b %d %H:%M:%S %Y] "
> </IfModule>
>
> <VirtualHost *:80>
> ...
>    <IfModule mod_jk.c>
>        # Send everything for context /examples to worker named worker1
> (ajp13)
>        JkMount  /jahia/* builder
>        JkMount  /pdfgenerator/* builder
>    </IfModule>
> </VirtualHost>
>
>
> server.xml
> =======
> <Connector className="org.apache.ajp.tomcat4.Ajp13Connector "
>               port="8009" minProcessors="5" maxProcessors="75"
>               acceptCount="10" debug="0"/>
>
> What settings are other people using in production?
>
>

--
Daryl Gerlach
Consultant
Collective Intelligence, Inc.
Phone: 717-545-9234
Fax: 717-545-9232
Cell: 717-649-3499


--
Daryl Gerlach
Consultant
Collective Intelligence, Inc.
Phone: 717-545-9234
Fax: 717-545-9232
Cell: 717-649-3499


Reply via email to