Hmmm, looks like we are not the only one who got bit by that thing in tender parts: http://stackoverflow.com/a/7397857
> On 25. 10. 2016, at 10:59 PM, OC <o...@ocs.cz> wrote: > > Chuck, > > On 24. 10. 2016, at 18:23, Chuck Hill <ch...@gevityinc.com> wrote: > >> Running “sudo jstack –F <process id>” should dump a trace of all threads. >> Should… > > Luckily, it did; just it is sort of weird that it identifies threads by name > in the deadlock report, but by some unknown ID (perhaps hash of the Thread > instance, looks like that) in the traces, which makes for sort of difficult > parsing. I guess I'm not the first one to swear :) > >> ... As for the hang, the code in the logging does seem like a likely culprit. > > Actually it looks like the culprit is log4j itself?!? What the heck! Anyroad, > here's the deadlock report: > > === > "model.ReportPDFTask@5f6ce9a5": > waiting to lock Monitor@0x000000012c0008a8 (Object@0x00000003f7efb980, a > org/apache/log4j/spi/RootLogger), > which is held by "WorkerThread6" > "WorkerThread6": > waiting for ownable synchronizer 0x00000003f9538960, (a > java/util/concurrent/locks/ReentrantLock$NonfairSync), > which is held by "model.ReportPDFTask@5f6ce9a5" > === > > the WorkerThread6 problem happens in the “log.info("$tch")” part, it looks > like this (and although changesFromCommittedSnapshot happens to be part of > the culprit, I believe the problem is actually caused by the logger itself): > instead of rendering its argument immediately, as soon as it is determined > that we should log, it does that self-evidently under a lock: > > === Worker6 (cleaned up considerably) === > - er.extensions.eof.ERXEC.lockObjectStore() // the lock W6 [2] hangs on, > caused by... > - er.extensions.eof.ERXGenericRecord.changesFromCommittedSnapshot() // ... > changesFromCommittedSnapshot ... > - cz.ocs.model.OCSEnterpriseObject.toString() // ... which yours truly dumbly > uses in toString, alas! > - org.apache.log4j.or.DefaultRenderer.doRender(java.lang.Object) // ... this > thing happens INSIDE of log4j code > ... lot of frames here; self-evidently, some of them acquires and holds > Monitor@0x000000012c0008a8 [1] ... > - org.apache.log4j.Category.info(java.lang.Object) // this is the > log.info("$tch") of mine > ... the loop and the other standard stuff up to dispatchRequest and > WOWorkerThread.run() ... > === > > The "model.ReportPDFTask" actually does not do anything wrong (far as I can > say) -- it just fetches. A fetch presumably (and understandably) would lock > the object store. With locked object store, it... logs! (I do not think the > stack trace is important here; anyway, I have attached it at the end of this > message for reference.) > > Nothing wrong there I believe, it feels right to be able to log with a locked > object store; but it causes deadlock, for the log tries to acquire its > Monitor@0x000000012c0008a8 -- which is held by Worker6. > > Now, although in my case the culprit happens to be unneeded (and generally > dangerous) changesFromCommittedSnapshot, I believe that > > (a) whatever which locks object store in the log would cause deadlock just as > well > (b) which can be e.g., a fault fired, or lots of other perfectly valid things > to log. > > I do understand why log4j methods accept generic Object and render the result > inside -- it is reasonable not to render anything unless we want to log with > the current setting. > > What seems to me to be a grave bug though is that this rendering happens > under a lock, which causes a deadlock if two threads log concurrently and it > so happens that > > (i) one of them logs under object store lock (which I believe is valid and > correct) > (ii) the other's log contents causes an object store lock (which again I > believe is valid and correct, see (a) and (b) above). > > So, well, is there indeed a grave bug inside of the ubiquitous log4j, or am I > overlooking something of importance? > > And if there is a bug, is there anything better one can do to work around it > than using toStrings for all non-trivial logs, like > “log.info("non-trivial-contents".toString())”, which would affect efficiency > pretty bad? > > Thanks and all the best, > OC > > === "model.ReportPDFTask@5f6ce9a5" slightly cleaned up === > - org.apache.log4j.Category.callAppenders(org.apache.log4j.spi.LoggingEvent) > @bci=12, line=204 (Compiled frame) // hangs on Monitor@0x000000012c0008a8, > which is held by [1] above > - org.apache.log4j.Category.forcedLog(java.lang.String, > org.apache.log4j.Priority, java.lang.Object, java.lang.Throwable) @bci=14, > line=391 (Compiled frame) > - org.apache.log4j.Category.info(java.lang.Object) @bci=38, line=666 > (Compiled frame) > - > er.extensions.eof.ERXEOAccessUtilities.logExpression(com.webobjects.eoaccess.EOAdaptorChannel, > com.webobjects.eoaccess.EOSQLExpression, long) @bci=406, line=1307 (Compiled > frame) > - > er.extensions.eof.ERXAdaptorChannelDelegate.adaptorChannelDidEvaluateExpression(com.webobjects.eoaccess.EOAdaptorChannel, > com.webobjects.eoaccess.EOSQLExpression) @bci=76, line=81 (Compiled frame) > - com.webobjects.foundation._NSDelegate.perform(java.lang.String, > java.lang.Object, java.lang.Object) @bci=14, line=163 (Compiled frame) > - > er.extensions.jdbc.ERXJDBCAdaptor$Channel.evaluateExpression(com.webobjects.eoaccess.EOSQLExpression) > @bci=2, line=286 (Compiled frame) > - > com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, > com.webobjects.eocontrol.EOEditingContext) @bci=97, line=488 (Compiled > frame) // somewhere in this code probably object store is locked [2], which > sort of makes sense > - > com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, > com.webobjects.eocontrol.EOEditingContext) @bci=79, line=4069 (Interpreted > frame) > - > er.extensions.eof.ERXEC.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification, > com.webobjects.eocontrol.EOEditingContext) @bci=10, line=1308 (Interpreted > frame) > - model._DBReport.fetchDBReports(com.webobjects.eocontrol.EOEditingContext, > com.webobjects.eocontrol.EOQualifier, com.webobjects.foundation.NSArray) > @bci=103, line=141 (Interpreted frame) // my code which fetches > ... the rest, I believe, is irrelevant ... > === > > > > _______________________________________________ > Do not post admin requests to the list. They will be ignored. > Webobjects-dev mailing list (Webobjects-dev@lists.apple.com) > Help/Unsubscribe/Update your Subscription: > https://lists.apple.com/mailman/options/webobjects-dev/ocs%40ocs.cz > > This email sent to o...@ocs.cz _______________________________________________ Do not post admin requests to the list. They will be ignored. Webobjects-dev mailing list (Webobjects-dev@lists.apple.com) Help/Unsubscribe/Update your Subscription: https://lists.apple.com/mailman/options/webobjects-dev/archive%40mail-archive.com This email sent to arch...@mail-archive.com