On 2/12/13 1:50 AM, Jens Dueholm Christensen wrote:


We're running Resin 3.1.11 (soon to be 3.1.12, in the next service window) in our production environment, and a few days ago we had an app that was restarted several times by the watchdog -- for no apparent reason.

The watchdog-log contains this:

[2013/02/08 11:17:46.478] WatchdogProcess[Watchdog[results],1] stopping Resin

[2013/02/08 11:17:46.478] WatchdogProcess[Watchdog[results],2] starting Resin

[2013/02/08 11:20:37.256] WatchdogProcess[Watchdog[results],2] stopping Resin

[2013/02/08 11:20:37.256] WatchdogProcess[Watchdog[results],3] starting Resin

[2013/02/08 11:23:24.221] WatchdogProcess[Watchdog[results],3] stopping Resin

[2013/02/08 11:23:24.221] WatchdogProcess[Watchdog[results],4] starting Resin
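For anyone comparing these timestamps against the stdout and GC logs, here is a minimal Python sketch that works out how long each instance stayed up. It assumes only the line format shown in the excerpt above; the function names and the regular expression are my own and are not part of Resin.

```python
import re
from datetime import datetime

# Pattern inferred from the excerpt above; other Resin versions may log differently.
LINE_RE = re.compile(
    r"\[(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\] "
    r"WatchdogProcess\[Watchdog\[(?P<server>[^\]]+)\],(?P<instance>\d+)\] "
    r"(?P<action>starting|stopping) Resin"
)

SAMPLE = """\
[2013/02/08 11:17:46.478] WatchdogProcess[Watchdog[results],1] stopping Resin
[2013/02/08 11:17:46.478] WatchdogProcess[Watchdog[results],2] starting Resin
[2013/02/08 11:20:37.256] WatchdogProcess[Watchdog[results],2] stopping Resin
[2013/02/08 11:20:37.256] WatchdogProcess[Watchdog[results],3] starting Resin
[2013/02/08 11:23:24.221] WatchdogProcess[Watchdog[results],3] stopping Resin
[2013/02/08 11:23:24.221] WatchdogProcess[Watchdog[results],4] starting Resin
"""


def parse_events(lines):
    """Return (timestamp, instance number, action) for each matching line."""
    events = []
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y/%m/%d %H:%M:%S.%f")
            events.append((ts, int(match.group("instance")), match.group("action")))
    return events


def instance_lifetimes(events):
    """Map instance number to seconds between its 'starting' and 'stopping' lines."""
    started = {}
    lifetimes = {}
    for ts, instance, action in events:
        if action == "starting":
            started[instance] = ts
        elif instance in started:
            # Instances whose start is not in the excerpt are skipped.
            lifetimes[instance] = (ts - started[instance]).total_seconds()
    return lifetimes


if __name__ == "__main__":
    for instance, seconds in sorted(instance_lifetimes(parse_events(SAMPLE.splitlines())).items()):
        print(f"instance {instance} ran for {seconds:.3f} s")
```

On the excerpt above this gives roughly 171 seconds for instance 2 and 167 seconds for instance 3, so each new instance exited a little under three minutes after it was started.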

There was a lot of regular and normal activity in our app's stdout log before and in between all the restarts, but nothing that -- in our opinion -- should cause a restart by the watchdog. The JVM log has no mention of problems (performing CMS and young generation GC as expected) and load on the server was also low -- no automatic stacktraces were taken.

Well, remember that the watchdog itself doesn't normally shut down Resin on errors. Resin exits by itself and the watchdog just starts a new instance. (Resin 4.0 communicates the reason better to the watchdog through exit codes.)

So the problem is in the Resin instance itself.

Are there hs_err* files or something similar?
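A quick way to check, sketched in Python. This assumes the HotSpot default, where a fatal error log named hs_err_pid<pid>.log is written to the JVM's working directory unless -XX:ErrorFile points somewhere else. The function name and the directories in the usage lines are placeholders, not paths taken from this setup.

```python
import glob
import os
import time


def find_hs_err_files(directories):
    """Return (path, mtime) pairs for hs_err_pid*.log files, newest first."""
    found = []
    for directory in directories:
        pattern = os.path.join(directory, "hs_err_pid*.log")
        for path in glob.glob(pattern):
            found.append((path, os.path.getmtime(path)))
    return sorted(found, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Placeholder directories: replace with the JVM's working directory
    # and any directory named in -XX:ErrorFile.
    for path, mtime in find_hs_err_files([".", "/tmp"]):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
        print(stamp, path)
```

If a file shows up with a modification time matching one of the restarts in the watchdog log, its header should name the signal or internal error that took the JVM down.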

-- Scott

We have been running with the same Resin configuration, app codebase and OS software stack for a long time, so we are quite baffled, as this struck us like lightning from a clear sky.

Is there any way of getting more verbose output about the watchdog and what it decides to do?

We tried setting <logger name="com.caucho" level="fine"/> and restarting Resin completely (not just a restart of the JVM), but that did not seem to help.


Jens Dueholm Christensen
Survey IT

resin-interest mailing list
