In case others are interested:

I raised the thread-pool thread-max to 300 and the server-default thread-max to 
1024.

I'm still interested in hearing if and how they are connected, but for now 
we've hadn't seen the freeze for over a week.

Regards,
Jens Dueholm Christensen
Rambøll Survey IT

From: resin-interest-boun...@caucho.com 
[mailto:resin-interest-boun...@caucho.com] On Behalf Of Jens Dueholm 
Christensen (JEDC)
Sent: Friday, March 02, 2012 10:11 AM
To: General Discussion for the Resin application server
Subject: Re: [Resin-interest] Problems with rollover of logfile with log4j 
1.2.16 and AsyncAppender

Hi Scott

Thanks for that suggestion.

However I'm a bit confused about what thread-max to change - our resin.conf 
looks like this:

<resin>
  <thread-pool>
    <thread-max>150</thread-max>
    <spare-thread-min>5</spare-thread-min>
  </thread-pool>
  <cluster-default>
    ...
  </cluster-default>

  <cluster id="xact-cluster">
    <server-default>
      <thread-max>512</thread-max>
      <thread-idle-min>10</thread-idle-min>
      <thread-idle-max>20</thread-idle-max>
      <keepalive-max>800</keepalive-max>
      <keepalive-timeout>10s</keepalive-timeout>
    </server-default>
  </cluster>
</resin>

The documentation is IMO not quite clear on how the "thread-pool thread-max" is 
connected to a "server-default thread-max", so a reccomendation would be 
appriciated.

Regards,
Jens Dueholm Christensen
Rambøll Survey IT
From: resin-interest-boun...@caucho.com 
[mailto:resin-interest-boun...@caucho.com] On Behalf Of Scott Ferguson
Sent: Thursday, March 01, 2012 7:00 PM
To: resin-interest@caucho.com
Subject: Re: [Resin-interest] Problems with rollover of logfile with log4j 
1.2.16 and AsyncAppender

Increase your thread-max for now. The 4.0.26 thread redesign will avoid this 
problem.

Basically, the log rollover is asking for a new thread at the same time as 
you're hitting thread-max, so it's getting stuck.

-- Scott


On 02/28/2012 08:37 AM, Scott Ferguson wrote:
On 02/28/2012 01:27 AM, Jens Dueholm Christensen (JEDC) wrote:
Hej Knut

Thanks for the link. In my somewhat extensive googling I havn't seen that page 
before.

It seems to confirm what is beeing said on 
http://www.simonsite.org.uk/#activeasync about fallback to synchronous logging 
if an exception is thrown.

This brings an interesting question to mind:

Can Resins logrotation somehow provoke an exception at the moment the logfiles 
are rotated, that causes log4j to fall back to synchronous logging?
Ie. if a buffer is attempted flushed while the backing file is unavailable?
Or the backend file is closed and reopened during a flush?

It should just be delayed, presumably for longer than some assumption of log4j.


The way log4j all of a sudden blocks (and with full buffers that are never 
flushed) seems to indicate that it has encountered an exception of some kind, 
so again I'd be happy for any comments from Caucho on this.

Well, it's a log4j issue, so I can only comment after I've debugged their code. 
(And the upcoming 4.0.26 release means that's still a day or two away.)

-- Scott



Oh yes - last night we saw it happen again. Our world ground to a halt with 
stacktraces showing the exact same as those in my original post. This was after 
a quiet weekend where we didn't encounter the problem once.

Regards,
Jens Dueholm Christensen
Rambøll Survey IT

From: 
resin-interest-boun...@caucho.com<mailto:resin-interest-boun...@caucho.com> 
[mailto:resin-interest-boun...@caucho.com] On Behalf Of Knut Forkalsrud
Sent: Monday, February 27, 2012 7:33 PM
To: General Discussion for the Resin application server
Subject: Re: [Resin-interest] Problems with rollover of logfile with log4j 
1.2.16 and AsyncAppender

You may find the discussion in 
http://glueclue.blogspot.com/2007/01/log4j-asyncappender-is-not-always_31.html 
useful.

Knut Forkalsrud


On Mon, Feb 27, 2012 at 00:16, Jens Dueholm Christensen (JEDC) 
<jens.dueh...@r-m.com<mailto:jens.dueh...@r-m.com>> wrote:
Bump? :)

Regards,
Jens Dueholm Christensen
Rambøll Survey IT

-----Original Message-----
From: 
resin-interest-boun...@caucho.com<mailto:resin-interest-boun...@caucho.com> 
[mailto:resin-interest-boun...@caucho.com<mailto:resin-interest-boun...@caucho.com>]
 On Behalf Of Jens Dueholm Christensen (JEDC)
Sent: Tuesday, February 21, 2012 9:01 PM
To: General Discussion for the Resin application server
Subject: Re: [Resin-interest] Problems with rollover of logfile with log4j 
1.2.16 and AsyncAppender

Hi Scott

I'd really appreciate it, if you took the time to do that - I (and my 
coworkers) really quite baffled.

We have had an increase of activity and load on the system during the last few 
weeks (more users and utilization, so nothing alarming), and now this problem 
has manifested itself, and I've never seen it before.

My understanding of what's happening is that the blocking thread is waiting for 
an empty buffer so it can flush it's own full buffer, but that notification 
never comes.

As the source for log4j shows the thread has called wait() on an ArrayList (the 
full buffer), and something has gone wrong, but what?

We've seen it happen 3 or 4 times during the last week - always a few minutes 
after midnight when Resin performs its logrotation.

In the stacktrace most of the threads that waits for the deadlocked thread is 
just (as is the deadlocked thread) doing a simple 
logger.info<http://logger.info>(String), so it might be possible to reproduce 
by spawning 10-20 threads that calls logger.info<http://logger.info>() and then 
have Resin perform a rollover once a minute.

Regards,
Jens Dueholm Christensen
Rambøll Survey IT

________________________________
From: 
resin-interest-boun...@caucho.com<mailto:resin-interest-boun...@caucho.com> 
[resin-interest-boun...@caucho.com<mailto:resin-interest-boun...@caucho.com>] 
On Behalf Of Scott Ferguson [f...@cauchomail.com<mailto:f...@cauchomail.com>]
Sent: 21 February 2012 19:01
To: resin-interest@caucho.com<mailto:resin-interest@caucho.com>
Subject: Re: [Resin-interest] Problems with rollover of logfile with log4j 
1.2.16 and AsyncAppender

On 02/20/2012 12:56 PM, Jens Dueholm Christensen (JEDC) wrote:

Hi

Lately we are seeing increased blocking behaviour from log4j when Resin 
performs the nightly rollover of logfiles.

We're running Resin Pro 3.1 (license #1013826) in a pre .11 snapshot 
(resin-pro-3.1.s110225) (due to bug #4349 and a bit of waiting before 3.1.11 
was released).
Updating to 3.1.12 is planned, but the changelog 
(http://www.caucho.com/resin-3.1/changes/changes.xtp is the best I could find?) 
does not mention any fix for our current problem, so I'm looking for any 
insights or advise.


I'd need to look at the log4j code to understand what it's doing.

I don't think that wait() should be related to Resin (other than timing) but 
I'd need to look at their code to be certain, and I can't think of any logging 
that would help. (Logging the logging code is tricky :)

3.1.13 should be out next week, by the way.

-- Scott


We've configured resin to perform rollovers of this logfile:

     <stdout-log path="/usr/local/www/xact/logs/resin-web-stdout.log"
       timestamp="[%Y-%m-%d %H:%M:%S.%s]" rollover-period="1D"/>

In our log4j configuration we log to stdout with:

  <appender name="CONSOLE" class="org.apache.log4j.ConsoleAppender">
     <param name="Target" value="System.out"/>
     <layout class="org.apache.log4j.PatternLayout">
        <param name="ConversionPattern" value="%d{ISO8601} [%-20.20t] %5p %c{1} 
- %m - %x%n"/>
     </layout>
  </appender>

  <appender name="ASYNC" class="org.apache.log4j.AsyncAppender">
     <appender-ref ref="CONSOLE"/>
  </appender>

  <root>
     <priority value="warn"/>
     <appender-ref ref="ASYNC"/>
  </root>

  <<loggers>>

Some time after midnight when Resin performs the rollover, the JVM becomes 
unresponsive and a stacktrace shows multiple threads hanging, and we have to 
manually restart (it doesn't get picked up by the watchdog, and can/will hang 
for hours).

A complete JVM stacktrace is attached to this mail.

All threads that are attempting to log are waiting for access to the object 
blocked by this thread:

"http-172.27.80.36:8080-30$1663241944" daemon prio=10 tid=0x00007f8804005800 
nid=0x2a40 in Object.wait() [0x00007f87a08ed000]
  java.lang.Thread.State: WAITING (on object monitor)
          at java.lang.Object.wait(Native Method)
          at java.lang.Object.wait(Object.java:485)
          at org.apache.log4j.AsyncAppender.append(AsyncAppender.java:195)
          - locked <0x00000005ba882928> (a java.util.ArrayList)
          at 
org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
          - locked <0x00000005b8d6db00> (a org.apache.log4j.AsyncAppender)
          at 
org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
          at org.apache.log4j.Category.callAppenders(Category.java:206)
          - locked <0x00000005ba6f9e38> (a org.apache.log4j.spi.RootLogger)
          at org.apache.log4j.Category.forcedLog(Category.java:391)
          at 
org.apache.log4j.Category.info<http://org.apache.log4j.Category.info>(Category.java:666)
          at 
com.pls.popinhandler.PopinScriptHandler.handleRequest(PopinScriptHandler.java:65)
          .

It seems like something never never calls notifyAll() on the ArrayList, and as 
a result our world grinds to a halt..

https://issues.apache.org/bugzilla/show_bug.cgi?id=38137#c16 (comment #16) has 
the same kind of stacktrace as we are seeing, and the comments in #17 does 
point towards Resin.

The last line in logfile resin-web-stdout.log.20120218 (before rotation) has 
the timestamp 2012-02-18 23:59:59,718.
The first line in the new logfile resin-web-stdout.log.20120219 has the 
timestamp 2012-02-19 01:30:48.878 (here we actually waited to see if the 
problem corrected itself - but alas) which was when we did a restart.
All other logfiles (watchdog, jvm, resin stdout and stderr) does not contain 
any indication of what is wrong.

If it's not Resin causing the problem, I guess we'll just have to switch our 
entire log4j setup to use org.apache.log4j.rolling.RollingFileAppender (part of 
log4j companions), but we'd prefer to stick with the existing setup.

Are there any interesting Resin loggers (and loglevel) that could show 
something?

Any thoughts or possible hints are appriciated.

Regards,
Jens Dueholm Christensen
Rambøll Survey IT





_______________________________________________
resin-interest mailing list
resin-interest@caucho.com<mailto:resin-interest@caucho.com><mailto:resin-interest@caucho.com<mailto:resin-interest@caucho.com>>
http://maillist.caucho.com/mailman/listinfo/resin-interest




_______________________________________________
resin-interest mailing list
resin-interest@caucho.com<mailto:resin-interest@caucho.com>
http://maillist.caucho.com/mailman/listinfo/resin-interest


_______________________________________________
resin-interest mailing list
resin-interest@caucho.com<mailto:resin-interest@caucho.com>
http://maillist.caucho.com/mailman/listinfo/resin-interest




_______________________________________________

resin-interest mailing list

resin-interest@caucho.com<mailto:resin-interest@caucho.com>

http://maillist.caucho.com/mailman/listinfo/resin-interest




_______________________________________________

resin-interest mailing list

resin-interest@caucho.com<mailto:resin-interest@caucho.com>

http://maillist.caucho.com/mailman/listinfo/resin-interest

_______________________________________________
resin-interest mailing list
resin-interest@caucho.com
http://maillist.caucho.com/mailman/listinfo/resin-interest

Reply via email to