On Sep 28, 2007, at 4:27 PM, BUNBURY Myles wrote:
Log4cxx seems to wilt under heavy load.
With the syslog appender enabled, the application dumps core under heavy load.
When using a RollingFileAppender with an XMLLayout, my output file gets garbled:
<<<snip>>>
<log4j:event logger="Gateway.AuthenticationServices/
ANS_AuthenticationService.cpp" timestamp="1191005220721"
level="INFO" thread="0x61eb3b90">
<log4j:message><![CDATA[No username/password token or SAML
assertion to process.]]></log4j:message>
</log4j:event>
<log4j:event logger="Gateway.MediationPipeline/
MED_PipelineMessage.cpp" timestamp="1191005220721" level="INFO"
thread="0x61eb3b90">
<log4j:message><![CDATA[No session policy set for user/group .]]></
log4j:message>
</log4j:event>
<log4j:event logger="Gateway.MediationPipeline/
MED_PipelineMessage.cpp" timestamp="1191005220721" level="DEBUG"
thread="0x61eb3b90">
<log4j:message><![CDATA[Using session #2 with session policy
DefaultPolicy.]]></log4j:message>
</log4j:event>
<log4j:event logger="Gateway.SessionInfra/SI_SessionManager.cpp"
timestamp="1191005220729" level="WARN" thread="0x5ecaeb90">
<log4j:message><![CDATA[Concurrent session limit reached.
Transaction is rejected.]]></log4j:message>
</log4j:event>
<log4j:event logger="Gateway.SessionInfra/SI_SessionManager.cpp"
timestamp="1191005220729" level="ERROR" thread="0x5ecaeb90">
<log4j:message><![CDATA[User session rejected - have exceeded the
maximum number of concurrent sessions based on the current license
(0 concurrent sessions allowed)]]></log4j:message>
</log4j:event>
¦« " ´T#
ýÍÊ^˜ÌÊ^Ä °Ñ"
ôÌÊ^µ ‡·<T#
ÜÑ"
XÍÊ^ <T#
ÜÑ"
À4‡· dÐÊ^ÍÊ^§c ´T#
ð‡·°Ñ"
@ÍÊ^1Ö†· T#
¸Ñ"
XÍÊ^ ôO ÔÑ"
|ó³ hÍÊ^
Þ$z XÍÊ^p÷³ xÍÊ^/&L4#
z ¸ÍÊ^Z;'°Ñ"
¸ÍÊ^ü(H4#
ðÍÊ^¸ÍÊ^13 Ì€ðÍÊ^ðÍÊ^
Þ$z ÏÊ^ÄydÐÊ^ðÍÊ^ ðÍÊ^
ÏÊ^PÏÊ^XÏÊ^h ·‹·
1191‘À‹·20729&ô €¼ÏÊ^(ÎÊ^®Ý‹· €È
€—ÎÊ^H€¼ÏÊ^H €@ € ô5 À
€ €À€ ¼ÏÊ^xÎÊ^ €
Ðõ‹··‹·ôO À€ ˆÎÊ^@ €‹‚&
·‹· X~—·¨ÎÊ^5
´ dÐÊ^Þ$¸ÎÊ^ H& çÎÊ^ÏÊ^^¥!øÎÊ^°#
ÏÊ^§c %
ôèÎÊ^ðÆ0ÐÊ^'ÏÊ^øÎÊ^ÙÃ'ÏÊ^'ÏÊ^
(ÏÊ^žÇ'ÏÊ^'ÏÊ^SÏÊ^ͦ!\sA
dÐÊ^(ÏÊ^%LRÏÊ^´ÏÊ^XÏÊ^ÖÇRÏÊ^SÏÊ^RÏÊ^¶0 \sA
kÏÊ^XÏÊ^¶&$HsA
dÐÊ^ÈÏÊ^¦«0ÐÊ^Þ
$xÏÊ^íÃHÐÊ^ÿÿÿÿˆÏÊ^ÄHÐÊ^l4¸ÏÊ^6ÉHÐÊ^
ó ôO l4z ÈÏÊ^e4 HÐÊ^dÐÊ^ÈÏÊ^uÉHÐÊ^Þ
[EMAIL PROTECTED] HÒÊ^ÑÊ^ÿÿÿÿ‘â³ Z;'°Ñ"
j 8ÐÊ^ü(0x5ecaeb90Ê^13 4€pÐÊ^pÐÊ^ Þ$z
HÐÊ^Í&ä6%
4ÐÊ^4ÐÊ^ ˆÐÊ^
*à6%
LÐÊ^LÐÊ^ Üí#
Ì€˜ÐÊ^<log4j:event logger="Gateway.SessionInfra/
SI_SessionManager.cpp" timestamp="1191005220729" level="WARN"
thread="0x600b0b90">
<log4j:message><![CDATA[Concurrent session limit reached.
Transaction is rejected.]]></log4j:message>
</log4j:event>
<log4j:event logger="Gateway.MediationPipeline/
MED_PipelineMessage.cpp" timestamp="1191005220729" level="INFO"
thread="0x5ecaeb90">
<log4j:message><![CDATA[Auth'ed user hitting dend. Transaction is
rejected.]]></log4j:message>
</log4j:event>
<log4j:event logger="Gateway.MediationPipeline/
MED_PipelineMessage.cpp" timestamp="1191005220729" level="INFO"
thread="0x5ecaeb90">
<log4j:message><![CDATA[Auth'ed user hitting deny-all ws]]></
log4j:message>
</log4j:event>
<log4j:event logger="Gateway.SessionInfra/SI_SessionManager.cpp"
timestamp="1191005220729" level="ERROR" thread="0x600b0b90">
<log4j:message><![CDATA[User session rejected - have exceeded the
maximum number of concurrent sessions based on the current license
(0 concurrent sessions allowed)]]></log4j:message>
</log4j:event>
<<<snip>>>
This output was generated with ~150 threads throwing out error
messages in rapid succession.
I have not noted these problems when treading lightly on the system.
At this point I'm suspicious that log4cxx is overflowing some sort of output buffer that fills up due to the volume. I don't believe it is a multithreading issue, as I serialized the logging calls in my code with a mutex and did not note any change.
Has anyone else noted something similar?
How is log4cxx supposed to behave when the logging volume exceeds
its capacity to output those events?
There were similar symptoms reported in https://issues.apache.org/jira/browse/LOGCXX-186. That ended up being due to attempting to configure log4cxx simultaneously on multiple threads. I assume that is not the case with you, but could you confirm that the problem persists if you configure log4cxx and make at least one logging request prior to launching threads (assuming that your app is the one launching the threads)?
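Something like this is the initialization order I have in mind (a rough sketch only; the configuration file name is just a placeholder for whatever your app actually loads):

    #include <log4cxx/logger.h>
    #include <log4cxx/xml/domconfigurator.h>

    int main()
    {
        // Configure exactly once, on the main thread, before any workers exist.
        log4cxx::xml::DOMConfigurator::configure("gateway-log4cxx.xml");

        // A first logging request here forces the logger hierarchy and the
        // appenders to be fully initialized before any other thread touches them.
        log4cxx::LoggerPtr root = log4cxx::Logger::getRootLogger();
        LOG4CXX_INFO(root, "log4cxx configured on the main thread");

        // ... only now launch the ~150 worker threads ...
        return 0;
    }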
If that turns out not to be the cause, the previous bug report might still be instructive in trying to identify and fix the problem. The problems might be related, but perhaps not. I would focus on trying to identify the one that is most significant to you and then see whether its resolution also fixes the other.
I would suggest:
1. Make sure that you are using the most current SVN HEAD.
2. Make sure that configuration and some initial log requests occur before threads are launched (as previously mentioned).
3. Wrap the failing appender in the soon-to-be-committed rewrite of AsyncAppender (if not in the next hour or so, at least by mid-day tomorrow). AsyncAppender has been badly flawed. I'm just about to commit a port of the log4j 1.2.14 AsyncAppender (which replaced an earlier log4j implementation that was badly flawed too). That should have the effect of bringing all appends on the underlying appender down to a common thread. Plus it would be a great stress test of all my new code (thanks for volunteering for testing). A sketch of wrapping an existing appender this way follows after this list.
4. Simplify, simplify. There are a couple of places where the corruption may be occurring. It could be an Appender problem, a Layout problem, a CharsetEncoder problem, a CharsetDecoder problem, or some combination. Isolating the problem makes resolution a lot easier.
a) Setting LOG4CXX_LOGCHAR_IS_UTF8 would replace a conversion between the local code page and UTF-8 with a straight copy, eliminating a lot of complexity and a potential source of problems.
b) Replace the layout (if there is one) with something simpler. First try simplifying the pattern as much as possible, for example %m%n.
c) See if the problem occurs with simpler appenders (FileAppender instead of RollingFileAppender) or with other, unrelated appenders. A sketch combining b) and c) follows after this list.
5. Attempt to recreate the problem in a sample app that you can attach to a bug report; a starting point for such an app follows after this list. Please give the specifics of every configuration tested (operating system, compiler, type and number of processors, etc.) and state whether each fails consistently, has not been observed to fail, or falls somewhere in between.
6. If you are running on x86 Unix, valgrind could be useful for identifying potential problems. I used the helgrind tool in earlier versions of valgrind (it's been broken for a while, so I had to build it from source) to identify potential race conditions. Other tools might be useful as well.
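For item 3, the wrap can be done against an appender that your existing configuration already created. This is only a sketch: the appender name "ROLLING" is a placeholder for whatever name your configuration uses, and it assumes an appender by that name is attached to the root logger.

    #include <log4cxx/logger.h>
    #include <log4cxx/asyncappender.h>

    void wrapInAsyncAppender()
    {
        log4cxx::LoggerPtr root = log4cxx::Logger::getRootLogger();

        // Look up the already-configured appender by its configured name.
        log4cxx::AppenderPtr target = root->getAppender(LOG4CXX_STR("ROLLING"));

        // Detach it from the logger and re-attach it behind an AsyncAppender,
        // so all appends to it are funneled through a single dispatch thread.
        root->removeAppender(target);
        log4cxx::AsyncAppenderPtr async(new log4cxx::AsyncAppender());
        async->addAppender(target);
        root->addAppender(async);
    }

The same wrapping can of course be expressed in the configuration file instead; the point is just that one thread ends up doing all the writing to the underlying appender.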
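For 4b and 4c together, here is a rough programmatic version of the simplification (the file name is a placeholder, and this throws away whatever appenders the configuration had installed on the root logger):

    #include <log4cxx/logger.h>
    #include <log4cxx/fileappender.h>
    #include <log4cxx/patternlayout.h>

    void useMinimalAppender()
    {
        // Simplest reasonable layout: just the message and a newline.
        log4cxx::LayoutPtr layout(
            new log4cxx::PatternLayout(LOG4CXX_STR("%m%n")));

        // Plain FileAppender instead of RollingFileAppender.
        log4cxx::FileAppenderPtr appender(
            new log4cxx::FileAppender(layout, LOG4CXX_STR("simple.log"), true));

        log4cxx::LoggerPtr root = log4cxx::Logger::getRootLogger();
        root->removeAllAppenders();
        root->addAppender(appender);
    }

If the corruption disappears with this combination, adding XMLLayout back first and then RollingFileAppender should tell you which of the two is implicated.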
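For item 5, something along these lines would be a reasonable starting point for the sample app. It uses plain pthreads, a hard-coded thread count, and a placeholder configuration file that would carry the RollingFileAppender/XMLLayout setup that misbehaves for you:

    #include <pthread.h>
    #include <log4cxx/logger.h>
    #include <log4cxx/xml/domconfigurator.h>

    static const int THREAD_COUNT = 150;
    static const int MESSAGES_PER_THREAD = 1000;

    static void* worker(void*)
    {
        log4cxx::LoggerPtr logger = log4cxx::Logger::getLogger("stress.worker");
        for (int i = 0; i < MESSAGES_PER_THREAD; i++) {
            LOG4CXX_ERROR(logger, "stress message from worker thread");
        }
        return 0;
    }

    int main()
    {
        // Configure and log once on the main thread before starting workers.
        log4cxx::xml::DOMConfigurator::configure("stress-log4cxx.xml");
        log4cxx::LoggerPtr root = log4cxx::Logger::getRootLogger();
        LOG4CXX_INFO(root, "starting stress test");

        pthread_t threads[THREAD_COUNT];
        for (int i = 0; i < THREAD_COUNT; i++) {
            pthread_create(&threads[i], 0, worker, 0);
        }
        for (int i = 0; i < THREAD_COUNT; i++) {
            pthread_join(threads[i], 0);
        }
        return 0;
    }

If that reproduces the garbling reliably, it would make an ideal attachment for the JIRA issue.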