Hi Dov,
On Aug 27, 2008, at 4:18 PM, Dov Rosenberg wrote:
We have a WO (5.4.1)
I'd seriously consider moving to 5.4.2 to see whether that
helps.
app that is deployed as a servlet in Tomcat 5.5 (Java 1.5). We do
not use Project Wonder or multiple ObjectStoreCoordinators. We have
experienced intermittent hanging issues under load. When we look at
the thread dumps I always see things like
"http-10042-Processor111" nid=60350 state=WAITING
- waiting on <0xcb1792> (a com.webobjects.foundation.NSRecursiveLock)
- locked <0xcb1792> (a com.webobjects.foundation.NSRecursiveLock)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Unknown Source)
at com.webobjects.foundation.NSRecursiveLock.lock(NSRecursiveLock.java:72)
at com.webobjects.eocontrol.EOObjectStoreCoordinator.lock(EOObjectStoreCoordinator.java:466)
at com.webobjects.eocontrol.EOEditingContext.lockObjectStore(EOEditingContext.java:4735)
at com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(EOEditingContext.java:4112)
at com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(EOEditingContext.java:4500)
at com.webobjects.eoaccess.EOUtilities.objectsMatchingValues(EOUtilities.java:193)
at com.webobjects.eoaccess.EOUtilities.objectsMatchingKeyAndValue(EOUtilities.java:168)
...
Those threads are symptoms, not the problem. The problem is that
there is a hanging lock on EOObjectStoreCoordinator.
It seems the root of every thread always has NSRecursiveLock.lock()
as part of the thread dump. It doesn’t seem to matter whether the call
was via EOUtilities or through a fetch.
The real key line in the trace is
at com.webobjects.eocontrol.EOObjectStoreCoordinator.lock(EOObjectStoreCoordinator.java:466)
In a recent thread dump there were 286 threads listed of which 251
(all HTTP threads) had references to NSRecursiveLock.lock().
You might want to reduce the number of threads and listeners you
create so that you find the problem sooner. The impact on your users
may be smaller, or at least less frustrating.
A more interesting question is: what did the other threads show?
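One way to pick out those other threads from inside the running JVM, as a rough sketch only: list every live thread whose stack does not pass through NSRecursiveLock.lock(). The class OtherThreads and its two methods are names made up for illustration; the only real API used is Thread.getAllStackTraces(), which is available from Java 1.5.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class OtherThreads {
    /** True if any frame in the stack is className.methodName. */
    public static boolean hasFrame(StackTraceElement[] stack, String className, String methodName) {
        for (StackTraceElement frame : stack) {
            if (frame.getClassName().equals(className)
                    && frame.getMethodName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    /** Names of live threads (with a non-empty stack) that are NOT inside className.methodName. */
    public static List<String> threadsNotIn(String className, String methodName) {
        List<String> names = new ArrayList<String>();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            StackTraceElement[] stack = e.getValue();
            if (stack.length > 0 && !hasFrame(stack, className, methodName)) {
                names.add(e.getKey().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Threads that are not queued on the OSC lock are the ones worth reading closely.
        for (String name : threadsNotIn("com.webobjects.foundation.NSRecursiveLock", "lock")) {
            System.out.println(name);
        }
    }
}
```

The same filter can be applied by hand to a saved thread dump; the point is only to set aside the waiting threads and look at whatever is left.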
All of the threads were marked as state=WAITING (none were
RUNNABLE). It seems that all of the threads were waiting for
something and thus could not do anything, but no Java deadlocks
were reported.
Yes, they were waiting for some thread to unlock the OSC.
Questions:
Does this indicate that we have an issue with multiple threads
within the same JVM? We have ConcurrentRequest handling turned on.
No.
Should we investigate using multiple ObjectStoreCoordinators?
This might increase the time before the entire instance goes dead, but
it won't address the root problem.
Is this a red herring and should I look elsewhere for the problem?
You have found the symptom, now you need to find the problem:
* app running out of memory and not unlocking the OSC
* your code locking the OSC and not unlocking it in a finally block
* bug in WO 5.4.1 (hypothetical, I don't know of one) locking the OSC
and not unlocking it in a finally block
* deadlock or very long running transaction at the database level that
is preventing an EOF operation from completing in a reasonable amount
of time.
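
For the "not unlocking it in a finally block" cases, this is the pattern to check your own code against. It is a minimal sketch, not WO code: a java.util.concurrent ReentrantLock stands in for the OSC lock, and the names LockInFinally, oscLock and withLock are made up for illustration. In a real app the lock()/unlock() pair would be the one on the object store coordinator or editing context.

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockInFinally {
    // Stand-in for the OSC lock (assumption for this sketch, not the WO class).
    public static final ReentrantLock oscLock = new ReentrantLock();

    /** Runs work while holding the lock; the lock is released even if work throws. */
    public static void withLock(Runnable work) {
        oscLock.lock();
        try {
            work.run();
        } finally {
            // Without this finally, an exception in work would leave the lock
            // held forever and every later caller would wait on it.
            oscLock.unlock();
        }
    }

    public static void main(String[] args) {
        try {
            withLock(new Runnable() {
                public void run() {
                    throw new IllegalStateException("simulated failure while locked");
                }
            });
        } catch (IllegalStateException e) {
            // The failure still propagates; only the lock state matters here.
        }
        System.out.println("still locked? " + oscLock.isLocked());
    }
}
```

If the unlock were placed after work.run() instead of in the finally block, the simulated failure would leave the lock held, which is exactly the kind of hanging lock the waiting threads point to.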
Chuck
--
Chuck Hill Senior Consultant / VP Development
Practical WebObjects - for developers who want to increase their
overall knowledge of WebObjects or who are trying to solve specific
problems.
http://www.global-village.net/products/practical_webobjects