Chuck,

On 24. 10. 2016, at 18:23, Chuck Hill <ch...@gevityinc.com> wrote:

> Running “sudo jstack –F <process id>” should dump a trace of all threads.  
> Should… 

Luckily, it did; just it is sort of weird that it identifies threads by name in 
the deadlock report, but by some unknown ID (perhaps hash of the Thread 
instance, looks like that) in the traces, which makes for sort of difficult 
parsing. I guess I'm not the first one to swear :)

> ... As for the hang, the code in the logging does seem like a likely culprit.

Actually it looks like the culprit is log4j itself?!? What the heck! Anyroad, 
here's the deadlock report:

===
"model.ReportPDFTask@5f6ce9a5":
  waiting to lock Monitor@0x000000012c0008a8 (Object@0x00000003f7efb980, a 
org/apache/log4j/spi/RootLogger),
  which is held by "WorkerThread6"
"WorkerThread6":
 waiting for ownable synchronizer 0x00000003f9538960, (a 
java/util/concurrent/locks/ReentrantLock$NonfairSync),
 which is held by "model.ReportPDFTask@5f6ce9a5"
===

the WorkerThread6 problem happens in the “log.info("$tch")” part, it looks like 
this (and although changesFromCommittedSnapshot happens to be part of the 
culprit, I believe the problem is actually caused by the logger itself): 
instead of rendering its argument immediately, as soon as it is determined that 
we should log, it does that self-evidently under a lock:

=== Worker6 (cleaned up considerably) ===
 - er.extensions.eof.ERXEC.lockObjectStore() // the lock W6 [2] hangs on, 
caused by...
 - er.extensions.eof.ERXGenericRecord.changesFromCommittedSnapshot() // ... 
changesFromCommittedSnapshot ...
 - cz.ocs.model.OCSEnterpriseObject.toString() // ... which yours truly dumbly 
uses in toString, alas!
 - org.apache.log4j.or.DefaultRenderer.doRender(java.lang.Object) // ... this 
thing happens INSIDE of log4j code
... lot of frames here; self-evidently, some of them acquires and holds 
Monitor@0x000000012c0008a8 [1] ...
 - org.apache.log4j.Category.info(java.lang.Object) // this is the 
log.info("$tch") of mine
... the loop and the other standard stuff up to dispatchRequest and 
WOWorkerThread.run() ...
===

The "model.ReportPDFTask" actually does not do anything wrong (far as I can 
say) -- it just fetches. A fetch presumably (and understandably) would lock the 
object store. With locked object store, it... logs! (I do not think the stack 
trace is important here; anyway, I have attached it at the end of this message 
for reference.)

Nothing wrong there I believe, it feels right to be able to log with a locked 
object store; but it causes deadlock, for the log tries to acquire its 
Monitor@0x000000012c0008a8 -- which is held by Worker6.

Now, although in my case the culprit happens to be unneeded (and generally 
dangerous) changesFromCommittedSnapshot, I believe that

(a) whatever which locks object store in the log would cause deadlock just as 
well
(b) which can be e.g., a fault fired, or lots of other perfectly valid things 
to log.

I do understand why log4j methods accept generic Object and render the result 
inside -- it is reasonable not to render anything unless we want to log with 
the current setting.

What seems to me to be a grave bug though is that this rendering happens under 
a lock, which causes a deadlock if two threads log concurrently and it so 
happens that

(i) one of them logs under object store lock (which I believe is valid and 
correct)
(ii) the other's log contents causes an object store lock (which again I 
believe is valid and correct, see (a) and (b) above).

So, well, is there indeed a grave bug inside of the ubiquitous log4j, or am I 
overlooking something of importance?

And if there is a bug, is there anything better one can do to work around it 
than using toStrings for all non-trivial logs, like 
“log.info("non-trivial-contents".toString())”, which would affect efficiency 
pretty bad?

Thanks and all the best,
OC

=== "model.ReportPDFTask@5f6ce9a5" slightly cleaned up ===
 - org.apache.log4j.Category.callAppenders(org.apache.log4j.spi.LoggingEvent) 
@bci=12, line=204 (Compiled frame) // hangs on Monitor@0x000000012c0008a8, 
which is held by [1] above
 - org.apache.log4j.Category.forcedLog(java.lang.String, 
org.apache.log4j.Priority, java.lang.Object, java.lang.Throwable) @bci=14, 
line=391 (Compiled frame)
 - org.apache.log4j.Category.info(java.lang.Object) @bci=38, line=666 (Compiled 
frame)
 - 
er.extensions.eof.ERXEOAccessUtilities.logExpression(com.webobjects.eoaccess.EOAdaptorChannel,
 com.webobjects.eoaccess.EOSQLExpression, long) @bci=406, line=1307 (Compiled 
frame)
 - 
er.extensions.eof.ERXAdaptorChannelDelegate.adaptorChannelDidEvaluateExpression(com.webobjects.eoaccess.EOAdaptorChannel,
 com.webobjects.eoaccess.EOSQLExpression) @bci=76, line=81 (Compiled frame)
 - com.webobjects.foundation._NSDelegate.perform(java.lang.String, 
java.lang.Object, java.lang.Object) @bci=14, line=163 (Compiled frame)
 - 
er.extensions.jdbc.ERXJDBCAdaptor$Channel.evaluateExpression(com.webobjects.eoaccess.EOSQLExpression)
 @bci=2, line=286 (Compiled frame)
 - 
com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
 com.webobjects.eocontrol.EOEditingContext) @bci=97, line=488 (Compiled frame) 
// somewhere in this code probably object store is locked [2], which sort of 
makes sense
 - 
com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
 com.webobjects.eocontrol.EOEditingContext) @bci=79, line=4069 (Interpreted 
frame)
 - 
er.extensions.eof.ERXEC.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
 com.webobjects.eocontrol.EOEditingContext) @bci=10, line=1308 (Interpreted 
frame)
 - model._DBReport.fetchDBReports(com.webobjects.eocontrol.EOEditingContext, 
com.webobjects.eocontrol.EOQualifier, com.webobjects.foundation.NSArray) 
@bci=103, line=141 (Interpreted frame) // my code which fetches
... the rest, I believe, is irrelevant ...
===



 _______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list      (Webobjects-dev@lists.apple.com)
Help/Unsubscribe/Update your Subscription:
https://lists.apple.com/mailman/options/webobjects-dev/archive%40mail-archive.com

This email sent to arch...@mail-archive.com

Reply via email to