https://issues.apache.org/bugzilla/show_bug.cgi?id=41214
--- Comment #44 from msm <[email protected]> --- Created attachment 29064 --> https://issues.apache.org/bugzilla/attachment.cgi?id=29064&action=edit Log4j bug deadlock fix and reproduction (Bug 41214) Hi Apache Log4j dev team, I am attaching the Bug41214.zip file. Inside this file there are two other zip files: Bug41214_Reproduction.zip and Log4j_1.2.17_Bug41214_Fix.zip. I am submitting a patch for bug 41214 (https://issues.apache.org/bugzilla/show_bug.cgi?id=41214), in the Log4j_1.2.17_Bug41214_Fix.zip attached file. Please, read below my explanation: Actually, I think the solution (or a mitigation) might be simple. I suggest the following "workaround" modifications, in the Log4J version 1.2.17 (the last version): 1)In the org.apache.log4j.spi.LoggingEvent class, inside all the constructors: Replace: this.message = message; By: this.message = message != null ? message.toString() : null; Doing so, you can make the "message" instance field always of the type String. 2)In the org.apache.log4j.spi.ThrowableInformation class: 2.1)Create the following instance field: private final String[] stackTraceArray; And, inside all the constructors, put the following piece of code: this.stackTraceArray = getThrowableStrRep(); The intention here is using "stackTraceArray" instance field instead of calling getThrowableStrRep() later. 2.2)Make getThrowableStrRep() private. Doing so, there is no need to maintain the synchronization of this method. Actually, I think there would be no need of synchronizing this method even if it were public, because I didn't see any place in Log4j source codes where an instance of ThrowableInformation is being shared by more than one thread. I do not understand what was the intention of the developer that created that method, maybe he/she predicted a future necessity of synchronization, as ThrowableInformation is public and has public constructors. We never know the sort of things the other programmers will do with our public classes and methods (the "bug in Log4j" we are discussing here shows exactly such a situation, doesn't it?) 2.3)Create a public method called "getStackTraceArray()" that returns "stackTraceArray" instance field (or we can also return a clone of this array). 3)In all the places where getThrowableStrRep() is being called, replace by getStackTraceArray() invocation. This replacement should be done in the following classes: org.apache.log4j.pattern.ThrowableInformationPatternConverter org.apache.log4j.pattern.LogEvent org.apache.log4j.lf5.Log4JLogRecord I do not think it is related just to RollingFileAppender, as the title of this page suggests. That's just a synchronization among threads problem, not diretly related to any appender. The idea behind this solution is based on the fact that inside toString() method of the String class there isn't any Log4j call. Trying to understand who is the "guilty" of the problem (whether is Log4j or the application/framework that uses Log4j) is not so important. What is revelant is the fact that we have a big serious problem that has to be solved. Otherwise, we can never rely on a multithreaded application subject to a Log4j deadlock. If the solution I proposed doesn't solve the problem, I will really consider the alternative of removing Log4j from my application. I will sadly do that, although I like Log4j so much. But I tested my solution, and I think it works and covers all the critical cases. Of course, as I said, this solution might be considered just a mitigation, not a real solution. In my opinion, a real and elegant solution would be done through a more deep refactoring. I also want to tell that this problem really happens, this is not just theoretical. I have a multithreaded application that uses Apache Mina framework for exchanging bytes with another multithreaded application. Sometimes the application simply stops working. I didn't understand why some threads remain BLOCKED until I have found the web page of the bug 41214. This web page was my salvation. In the Bug41214_Reproduction.zip attached file, you have the source codes of an executable that reproduces the bug. First, try running this executable with log4j version 1.2.17 without the fix suggested by me. You see a deadlock happening. Try using jconsole to see what's going on. You'll see in jconsole something like this: ============== Name: AnExceptionThread State: BLOCKED on org.apache.log4j.ConsoleAppender@13e75a5 owned by: AnObjectThread Total blocked: 2 Total waited: 0 Stack trace: org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:231) org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) org.apache.log4j.Category.callAppenders(Category.java:206) - locked org.apache.log4j.Logger@147e668 org.apache.log4j.Category.forcedLog(Category.java:391) org.apache.log4j.Category.info(Category.java:683) com.d.AnExceptionThread.run(AnExceptionThread.java:19) ============== ============== Name: AnObjectThread State: BLOCKED on org.apache.log4j.spi.RootLogger@157b46f owned by: RootLoggerThread Total blocked: 1 Total waited: 1 Stack trace: org.apache.log4j.Category.callAppenders(Category.java:205) org.apache.log4j.Category.forcedLog(Category.java:391) org.apache.log4j.Category.info(Category.java:666) com.a.AnObject.toString(AnObject.java:24) org.apache.log4j.or.DefaultRenderer.doRender(DefaultRenderer.java:37) org.apache.log4j.or.RendererMap.findAndRender(RendererMap.java:80) org.apache.log4j.spi.LoggingEvent.getRenderedMessage(LoggingEvent.java:368) org.apache.log4j.helpers.PatternParser$BasicPatternConverter.convert(PatternParser.java:402) org.apache.log4j.helpers.PatternConverter.format(PatternConverter.java:65) org.apache.log4j.PatternLayout.format(PatternLayout.java:506) org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:310) org.apache.log4j.WriterAppender.append(WriterAppender.java:162) org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251) - locked org.apache.log4j.ConsoleAppender@13e75a5 org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) org.apache.log4j.Category.callAppenders(Category.java:206) - locked org.apache.log4j.Logger@12c4c57 org.apache.log4j.Category.forcedLog(Category.java:391) org.apache.log4j.Category.info(Category.java:666) com.c.AnObjectThread.run(AnObjectThread.java:20) ============== ============== Name: RootLoggerThread State: BLOCKED on org.apache.log4j.ConsoleAppender@13e75a5 owned by: AnObjectThread Total blocked: 1 Total waited: 0 Stack trace: org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:231) org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) org.apache.log4j.Category.callAppenders(Category.java:206) - locked org.apache.log4j.spi.RootLogger@157b46f org.apache.log4j.Category.forcedLog(Category.java:391) org.apache.log4j.Category.info(Category.java:666) com.main.RootLoggerThread.run(RootLoggerThread.java:18) After applying the fix I suggest, no more deadlock happens. ============== Thanks in advance Marcelo Sussumu Miashiro. -- You are receiving this mail because: You are the assignee for the bug. --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
