Hi all,
We use Virgo server and it comes with Logback 1.0.7.v20121108-1250. One of the
users of our product started complaining that after about a week of using our
product, it was becoming so slow that it was basically unusable. We had a live
session with the client and it turned out that the problem was in Logback. The
client uses Windows. So far the problem has not been reported on Linux. Here's
what I did to determine that Logback was the cause of the problem:
1. The appender is defined as:
<appender name="LOG_FILE"
class="ch.qos.logback.core.rolling.RollingFileAppender">
<file>D:/logs/product.log</file>
<rollingPolicy
class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy">
<FileNamePattern>D:/logs/product_%i.log.zip</FileNamePattern>
<MinIndex>0</MinIndex>
<MaxIndex>9</MaxIndex>
</rollingPolicy>
<triggeringPolicy
class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
<MaxFileSize>50MB</MaxFileSize>
</triggeringPolicy>
<encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
<Pattern>[%d{yyyy-MM-dd'T'HH:mm:ss.SSSXXX}] [%-5level] %-28.28thread
%X{sessionGuid} %-64.64logger{64} %X{medic.eventCode} %msg %ex%n</Pattern>
</encoder>
</appender>
In the appender configuration above, I modified the directories and log names a
little bit, so that the actual product name would not be visible.
2. All the rolled-over zip files were generated and present, and the main log
file, product.log, had exceeded its maximum size of 50 MB and had reached 64
MB. We had something like:
product.log <------- 64 MB
product_0.log.zip
product_1.log.zip
product_2.log.zip
product_3.log.zip
product_4.log.zip
product_5.log.zip
product_6.log.zip
product_7.log.zip
product_8.log.zip
product_9.log.zip
3. This was suspicious, so I examined the log file and saw many messages that
were logged out of order, e.g.
09:55:43.676
09:55:42.836
09:55:46.165
09:55:44.583
...
Furthermore, even the simplest operations, e.g. simply sending a redirect
response back to the browser, was taking several seconds. (However, before
sending the redirect, we were logging the URL to which we were redirecting, so
I suspected that the logger was extremely slow.)
3. I moved the archived log files to a different directory and they got
recreated *immediately*. I did this 7-8 times, each time with exactly the same
result -- the .zip files got recreated immediately.
4. The main log, product.log, was locked by the JVM, so I couldn't move it or
delete it. However, I opened it with a text editor, selected the whole content
(Ctrl+A), deleted it (Del) and saved the file. As soon as I did that, the
performance problems disappeared. The product began working normally. Also, the
log file started growing with normal speed. Previously, it was growing
super-slowly.
So, we told the users to do the following workaround when the preconditions
from step 2 above were reached and the product became very slow:
(1) Open the main log file, select all, delete, save.
(2) Delete all archived .zip files
This workaround has been working fine so far. However, we still don't know why
Logback behaves like this. That's why I'm writing this email. Can someone, who
knows the Logback source code, figure out what might be happening and what
triggers it?
Here are some additional details that we've been able to figure out:
1. When the users were experiencing the problem during the live session with
them, I opened ProcessExplorer and saw that Splunk was holding a lock onto the
main log file -- product log. I asked the users to kill the Splunk service and
they did, but that didn't solve the problem. However, depending on the way
Logback is written, it may turn out that it indeed went crazy because of
Splunk, but *after* it goes crazy, killing the other process that hold a lock
on the file doesn't help.
2. We managed to reproduce the same behavior in our company by letting two
different appenders write to the same log file. Then, as soon as the
preconditions on step 2 above were reached, everything became extremely slow,
the logs couldn't roll over and the main log started growing beyond its
specified limit of 50 MB. However, the users do *not* have such a configuration
in which two appenders write to the same file. I'm only giving you information
on how we managed to reproduce the slowness in case it might help you figure
out what's wrong. The end result is the same -- Logback slows down to a crawl,
but apparently the trigger in our users' environment is different and it's not
related to appenders writing to the same file.
3. A thread dump while we were reproducing the problem (with two appenders
writing to the same file) reveals the following:
3.1. One thread holds a lock (0x00000000dfa5d600):
"Thread-48" #199 daemon prio=5 os_prio=0 tid=0x0000000024b61000 nid=0x772c
runnable [0x000000002422e000]
java.lang.Thread.State: RUNNABLE
at java.util.zip.Deflater.deflateBytes(Native Method)
at java.util.zip.Deflater.deflate(Deflater.java:442)
- locked <0x00000000fdb8dcf0> (a java.util.zip.ZStreamRef)
at java.util.zip.Deflater.deflate(Deflater.java:364)
at java.util.zip.DeflaterOutputStream.deflate(DeflaterOutputStream.java:251)
at java.util.zip.DeflaterOutputStream.write(DeflaterOutputStream.java:211)
at java.util.zip.ZipOutputStream.write(ZipOutputStream.java:331)
- locked <0x00000000fdb8da00> (a java.util.zip.ZipOutputStream)
at
ch.qos.logback.core.rolling.helper.Compressor.zipCompress(Compressor.java:110)
at
ch.qos.logback.core.rolling.helper.Compressor.compress(Compressor.java:58)
at
ch.qos.logback.core.rolling.FixedWindowRollingPolicy.rollover(FixedWindowRollingPolicy.java:147)
at
ch.qos.logback.core.rolling.RollingFileAppender.rollover(RollingFileAppender.java:111)
- locked <0x00000000dfa5d460> (a ch.qos.logback.core.spi.LogbackLock)
at
ch.qos.logback.core.rolling.RollingFileAppender.subAppend(RollingFileAppender.java:144)
- locked <0x00000000dfa5d600> (a
ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy)
at
ch.qos.logback.core.OutputStreamAppender.append(OutputStreamAppender.java:103)
at
ch.qos.logback.core.UnsynchronizedAppenderBase.doAppend(UnsynchronizedAppenderBase.java:88)
at
ch.qos.logback.core.spi.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:48)
at ch.qos.logback.classic.Logger.appendLoopOnAppenders(Logger.java:280)
at ch.qos.logback.classic.Logger.callAppenders(Logger.java:267)
at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:449)
at ch.qos.logback.classic.Logger.filterAndLog_0_Or3Plus(Logger.java:403)
at ch.qos.logback.classic.Logger.log(Logger.java:803)
at
org.apache.commons.logging.impl.SLF4JLocationAwareLog.warn(SLF4JLocationAwareLog.java:185)
...
3.2. A multitude of threads are blocked and waiting to acquire that lock, e.g.:
"Thread-55" #206 daemon prio=5 os_prio=0 tid=0x000000001f50d000 nid=0x7288
waiting for monitor entry [0x00000000263ef000]
java.lang.Thread.State: BLOCKED (on object monitor)
at
ch.qos.logback.core.rolling.RollingFileAppender.subAppend(RollingFileAppender.java:142)
- waiting to lock <0x00000000dfa5d600> (a
ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy)
at
ch.qos.logback.core.OutputStreamAppender.append(OutputStreamAppender.java:103)
at
ch.qos.logback.core.UnsynchronizedAppenderBase.doAppend(UnsynchronizedAppenderBase.java:88)
at
ch.qos.logback.core.spi.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:48)
at ch.qos.logback.classic.Logger.appendLoopOnAppenders(Logger.java:280)
at ch.qos.logback.classic.Logger.callAppenders(Logger.java:267)
at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:449)
at ch.qos.logback.classic.Logger.filterAndLog_0_Or3Plus(Logger.java:403)
at ch.qos.logback.classic.Logger.log(Logger.java:803)
at
org.apache.commons.logging.impl.SLF4JLocationAwareLog.warn(SLF4JLocationAwareLog.java:185)
...
I have attached the logging configuration from the Virgo server. It's called
serviceability.xml. (That's the default configuration file name for the Virgo
server.)
I'm sorry the mail turned out so long. I hope this will not discourage you from
reading it. I will be extremely grateful if you, guys, can help us figure out
what makes Logback go crazy.
<!--
Virgo log configuration
See http://www.eclipse.org/virgo/documentation/virgo-documentation-3.6.3.RELEASE/docs/virgo-user-guide/htmlsingle/virgo-user-guide.html#configuring-serviceability
-->
<configuration>
<contextListener class="ch.qos.logback.classic.jul.LevelChangePropagator"/>
<appender name="LOG_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
<file>D:/logs/product.log</file>
<rollingPolicy class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy">
<FileNamePattern>D:/logs/product_%i.log.zip</FileNamePattern>
<MinIndex>0</MinIndex>
<MaxIndex>9</MaxIndex>
</rollingPolicy>
<triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
<MaxFileSize>50MB</MaxFileSize>
</triggeringPolicy>
<encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
<Pattern>[%d{yyyy-MM-dd'T'HH:mm:ss.SSSXXX}] [%-5level] %-28.28thread %X{sessionGuid} %-64.64logger{64} %X{medic.eventCode} %msg %ex%n</Pattern>
</encoder>
</appender>
<appender name="EVENT_LOG_STDOUT" class="org.eclipse.virgo.medic.log.logback.ReroutingAwareConsoleAppender">
<encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
<Pattern>[%d{yyyy-MM-dd'T'HH:mm:ss.SSSXXX}] [%-5level] %-28.28thread %X{sessionGuid} <%X{medic.eventCode}> %msg %ex%n</Pattern>
</encoder>
</appender>
<!-- eventlog.log is a small log recording only important events in Virgo, i.e. bundle activities, etc. -->
<appender name="EVENT_LOG_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
<file>D:/logs/eventlog.log</file>
<rollingPolicy class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy">
<FileNamePattern>D:/logs/eventlog_%i.log.zip</FileNamePattern>
<MinIndex>1</MinIndex>
<MaxIndex>4</MaxIndex>
</rollingPolicy>
<triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
<MaxFileSize>10MB</MaxFileSize>
</triggeringPolicy>
<encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
<Pattern>[%d{yyyy-MM-dd'T'HH:mm:ss.SSSXXX}] [%-5level] %-28.28thread %X{sessionGuid} <%X{medic.eventCode}> %msg %ex%n</Pattern>
</encoder>
</appender>
<!-- query_profile_dataservice.log is only used for debugging -->
<appender name="DS_QUERY_PROFILE_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
<file>sD:/logs/query_profile_dataservice.log</file>
<rollingPolicy class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy">
<FileNamePattern>D:/logs/query_profile_dataservice_%i.log.zip</FileNamePattern>
<MinIndex>0</MinIndex>
<MaxIndex>9</MaxIndex>
</rollingPolicy>
<triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
<MaxFileSize>50MB</MaxFileSize>
</triggeringPolicy>
<filter class="ch.qos.logback.core.filter.EvaluatorFilter">
<evaluator class="ch.qos.logback.classic.boolex.OnMarkerEvaluator">
<marker>DS_QUERY_PROFILE</marker>
</evaluator>
<onMismatch>DENY</onMismatch>
<onMatch>NEUTRAL</onMatch>
</filter>
<encoder>
<pattern>%msg%n@!@!@</pattern>
</encoder>
</appender>
<root level="INFO">
<appender-ref ref="LOG_FILE" />
</root>
<appender name="PRODUCTMESSAGING_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
<file>D:/logs/productmessaging.log</file>
<rollingPolicy class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy">
<FileNamePattern>D:/logs/productmessaging_%i.log.zip</FileNamePattern>
<MinIndex>0</MinIndex>
<MaxIndex>9</MaxIndex>
</rollingPolicy>
<triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
<MaxFileSize>50MB</MaxFileSize>
</triggeringPolicy>
<encoder>
<Pattern>[%d{yyyy-MM-dd HH:mm:ss.SSS}] [%-5level] %-28.28thread %msg %ex%n</Pattern>
</encoder>
</appender>
<logger level="DEBUG" additivity="true" name="com.acme.messaging">
<appender-ref ref="PRODUCTMESSAGING_FILE" />
</logger>
<logger level="INFO" additivity="true" name="com.acme.data.query.impl.DataServiceImpl">
<appender-ref ref="DS_QUERY_PROFILE_FILE" />
</logger>
<logger level="INFO" additivity="false" name="org.eclipse.virgo.medic.eventlog.localized">
<appender-ref ref="EVENT_LOG_STDOUT" />
<appender-ref ref="EVENT_LOG_FILE" />
</logger>
<logger level="INFO" additivity="false" name="org.eclipse.virgo.medic.eventlog.default">
<appender-ref ref="LOG_FILE" />
</logger>
<logger level="ERROR" additivity="false" name="com.acme.ph.client.common.extensions">
<!-- only error messages from phonehome client extensions are necessary -->
<appender-ref ref="LOG_FILE" />
</logger>
</configuration>
_______________________________________________
logback-user mailing list
[email protected]
http://mailman.qos.ch/mailman/listinfo/logback-user