Hmmm, looks like we are not the only one who got bit by that thing in tender 
parts: http://stackoverflow.com/a/7397857

> On 25. 10. 2016, at 10:59 PM, OC <o...@ocs.cz> wrote:
> 
> Chuck,
> 
> On 24. 10. 2016, at 18:23, Chuck Hill <ch...@gevityinc.com> wrote:
> 
>> Running “sudo jstack –F <process id>” should dump a trace of all threads.  
>> Should… 
> 
> Luckily, it did; just it is sort of weird that it identifies threads by name 
> in the deadlock report, but by some unknown ID (perhaps hash of the Thread 
> instance, looks like that) in the traces, which makes for sort of difficult 
> parsing. I guess I'm not the first one to swear :)
> 
>> ... As for the hang, the code in the logging does seem like a likely culprit.
> 
> Actually it looks like the culprit is log4j itself?!? What the heck! Anyroad, 
> here's the deadlock report:
> 
> ===
> "model.ReportPDFTask@5f6ce9a5":
>  waiting to lock Monitor@0x000000012c0008a8 (Object@0x00000003f7efb980, a 
> org/apache/log4j/spi/RootLogger),
>  which is held by "WorkerThread6"
> "WorkerThread6":
> waiting for ownable synchronizer 0x00000003f9538960, (a 
> java/util/concurrent/locks/ReentrantLock$NonfairSync),
> which is held by "model.ReportPDFTask@5f6ce9a5"
> ===
> 
> the WorkerThread6 problem happens in the “log.info("$tch")” part, it looks 
> like this (and although changesFromCommittedSnapshot happens to be part of 
> the culprit, I believe the problem is actually caused by the logger itself): 
> instead of rendering its argument immediately, as soon as it is determined 
> that we should log, it does that self-evidently under a lock:
> 
> === Worker6 (cleaned up considerably) ===
> - er.extensions.eof.ERXEC.lockObjectStore() // the lock W6 [2] hangs on, 
> caused by...
> - er.extensions.eof.ERXGenericRecord.changesFromCommittedSnapshot() // ... 
> changesFromCommittedSnapshot ...
> - cz.ocs.model.OCSEnterpriseObject.toString() // ... which yours truly dumbly 
> uses in toString, alas!
> - org.apache.log4j.or.DefaultRenderer.doRender(java.lang.Object) // ... this 
> thing happens INSIDE of log4j code
> ... lot of frames here; self-evidently, some of them acquires and holds 
> Monitor@0x000000012c0008a8 [1] ...
> - org.apache.log4j.Category.info(java.lang.Object) // this is the 
> log.info("$tch") of mine
> ... the loop and the other standard stuff up to dispatchRequest and 
> WOWorkerThread.run() ...
> ===
> 
> The "model.ReportPDFTask" actually does not do anything wrong (far as I can 
> say) -- it just fetches. A fetch presumably (and understandably) would lock 
> the object store. With locked object store, it... logs! (I do not think the 
> stack trace is important here; anyway, I have attached it at the end of this 
> message for reference.)
> 
> Nothing wrong there I believe, it feels right to be able to log with a locked 
> object store; but it causes deadlock, for the log tries to acquire its 
> Monitor@0x000000012c0008a8 -- which is held by Worker6.
> 
> Now, although in my case the culprit happens to be unneeded (and generally 
> dangerous) changesFromCommittedSnapshot, I believe that
> 
> (a) whatever which locks object store in the log would cause deadlock just as 
> well
> (b) which can be e.g., a fault fired, or lots of other perfectly valid things 
> to log.
> 
> I do understand why log4j methods accept generic Object and render the result 
> inside -- it is reasonable not to render anything unless we want to log with 
> the current setting.
> 
> What seems to me to be a grave bug though is that this rendering happens 
> under a lock, which causes a deadlock if two threads log concurrently and it 
> so happens that
> 
> (i) one of them logs under object store lock (which I believe is valid and 
> correct)
> (ii) the other's log contents causes an object store lock (which again I 
> believe is valid and correct, see (a) and (b) above).
> 
> So, well, is there indeed a grave bug inside of the ubiquitous log4j, or am I 
> overlooking something of importance?
> 
> And if there is a bug, is there anything better one can do to work around it 
> than using toStrings for all non-trivial logs, like 
> “log.info("non-trivial-contents".toString())”, which would affect efficiency 
> pretty bad?
> 
> Thanks and all the best,
> OC
> 
> === "model.ReportPDFTask@5f6ce9a5" slightly cleaned up ===
> - org.apache.log4j.Category.callAppenders(org.apache.log4j.spi.LoggingEvent) 
> @bci=12, line=204 (Compiled frame) // hangs on Monitor@0x000000012c0008a8, 
> which is held by [1] above
> - org.apache.log4j.Category.forcedLog(java.lang.String, 
> org.apache.log4j.Priority, java.lang.Object, java.lang.Throwable) @bci=14, 
> line=391 (Compiled frame)
> - org.apache.log4j.Category.info(java.lang.Object) @bci=38, line=666 
> (Compiled frame)
> - 
> er.extensions.eof.ERXEOAccessUtilities.logExpression(com.webobjects.eoaccess.EOAdaptorChannel,
>  com.webobjects.eoaccess.EOSQLExpression, long) @bci=406, line=1307 (Compiled 
> frame)
> - 
> er.extensions.eof.ERXAdaptorChannelDelegate.adaptorChannelDidEvaluateExpression(com.webobjects.eoaccess.EOAdaptorChannel,
>  com.webobjects.eoaccess.EOSQLExpression) @bci=76, line=81 (Compiled frame)
> - com.webobjects.foundation._NSDelegate.perform(java.lang.String, 
> java.lang.Object, java.lang.Object) @bci=14, line=163 (Compiled frame)
> - 
> er.extensions.jdbc.ERXJDBCAdaptor$Channel.evaluateExpression(com.webobjects.eoaccess.EOSQLExpression)
>  @bci=2, line=286 (Compiled frame)
> - 
> com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
>  com.webobjects.eocontrol.EOEditingContext) @bci=97, line=488 (Compiled 
> frame) // somewhere in this code probably object store is locked [2], which 
> sort of makes sense
> - 
> com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
>  com.webobjects.eocontrol.EOEditingContext) @bci=79, line=4069 (Interpreted 
> frame)
> - 
> er.extensions.eof.ERXEC.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
>  com.webobjects.eocontrol.EOEditingContext) @bci=10, line=1308 (Interpreted 
> frame)
> - model._DBReport.fetchDBReports(com.webobjects.eocontrol.EOEditingContext, 
> com.webobjects.eocontrol.EOQualifier, com.webobjects.foundation.NSArray) 
> @bci=103, line=141 (Interpreted frame) // my code which fetches
> ... the rest, I believe, is irrelevant ...
> ===
> 
> 
> 
> _______________________________________________
> Do not post admin requests to the list. They will be ignored.
> Webobjects-dev mailing list      (Webobjects-dev@lists.apple.com)
> Help/Unsubscribe/Update your Subscription:
> https://lists.apple.com/mailman/options/webobjects-dev/ocs%40ocs.cz
> 
> This email sent to o...@ocs.cz


 _______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list      (Webobjects-dev@lists.apple.com)
Help/Unsubscribe/Update your Subscription:
https://lists.apple.com/mailman/options/webobjects-dev/archive%40mail-archive.com

This email sent to arch...@mail-archive.com

Reply via email to