Re: [Resin-interest] Best Way to Track Down Random Resin Restarts

Scott Ferguson Thu, 17 Feb 2011 12:07:10 -0800

Shane Cruz wrote:
> So, with full debug logging turned on, I did see this exception in the 
> logs right before the restart:
>
> [13:55:37.603] com.caucho.log.EnvironmentLogger.log 
> com.caucho.config.ConfigException: OpenSSL can't open 
> certificate-chain-file '/nfs/certs/mysite.crt'
> [13:55:37.603]  at com.caucho.vfs.OpenSSLFactory.open(Native Method)
> [13:55:37.603]  at 
> com.caucho.vfs.OpenSSLFactory.accept(OpenSSLFactory.java:419)
> [13:55:37.603]  at com.caucho.server.port.Port.accept(Port.java:813)
> [13:55:37.603]  at 
> com.caucho.server.port.TcpConnection.run(TcpConnection.java:495)
> [13:55:37.603]  at 
> com.caucho.util.ThreadPool.runTasks(ThreadPool.java:520)
> [13:55:37.603]  at com.caucho.util.ThreadPool.run(ThreadPool.java:442)
> [13:55:37.603]  at java.lang.Thread.run(Thread.java:619)
> [13:55:37.603]
> [13:55:49.109] com.caucho.log.EnvironmentLogger.log Server[myserver1] 
> starting
>
> That certificate is getting loaded over NFS. Is there a chance that a 
> certificate loading failure due to an NFS issue could cause the JVM to 
> exit?  I thought the certificate would just be loaded one time at 
> startup, but it looks like maybe it accesses it during runtime as well.


Possibly an issue running out of file descriptors?

That exception shouldn't cause a restart directly. It would cause that 
thread to exit, but would also start up a new thread to listen to that 
port (because it's assuming the current thread is broken for some reason.)

But you could get a "can't open" if you run out of file descriptors, and 
running out of file descriptors can force a restart.
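
To check the descriptor theory, something like the following is a
minimal sketch, not anything built into Resin. It assumes a Sun/Oracle
JVM on a Unix-like OS, where the platform MXBean implements
com.sun.management.UnixOperatingSystemMXBean. The class name FdCheck and
its helper methods are hypothetical names made up for this example. It
reports the counts for the JVM it runs in, so to see Resin's own numbers
the same calls would have to run inside the Resin JVM (for example from
a JSP or a servlet), or the same MXBean would have to be read over JMX.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper: reports open vs. maximum file descriptors
// for the JVM it is running in.
public class FdCheck {

    /** Fraction of the descriptor limit in use, or -1 if the limit is unknown. */
    public static double usedFraction(long open, long max) {
        return max > 0 ? (double) open / max : -1.0;
    }

    /** Returns {open, max}, or null if this JVM does not expose the counts. */
    public static long[] descriptorCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // Assumption: Sun/Oracle JVM on Unix; other JVMs may not implement this.
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] counts = descriptorCounts();
        if (counts == null) {
            System.out.println("descriptor counts not available on this JVM");
            return;
        }
        System.out.printf("open=%d max=%d used=%.1f%%%n",
            counts[0], counts[1], 100 * usedFraction(counts[0], counts[1]));
    }
}
```

If the open count climbs toward the maximum in the minutes before a
restart, that would support the descriptor theory; if it stays flat,
the cause is more likely elsewhere.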

-- Scott
>
> Unfortunately, on a different JVM, there was a crash that doesn't seem 
> to have the same exception:
>
> [13:36:03.102] com.caucho.log.EnvironmentLogger.log allocate 
> PoolItem[jdbc/db1,3340053,com.caucho.sql.ManagedConnectionImpl@744ab820]
> [13:36:03.102] com.caucho.log.EnvironmentLogger.log allocate 
> PoolItem[jdbc/db2,1020267,com.caucho.sql.ManagedConnectionImpl@2a121a07]
> [13:36:16.815] com.caucho.log.EnvironmentLogger.log Server[myserver2] 
> starting
>
> Scott, what are your thoughts on the certificate issue?  To be safe, 
> we should probably start by not loading the certificate over an NFS share.
>
> Thanks,
> Shane
>
> On Fri, Feb 11, 2011 at 1:40 PM, Scott Ferguson wrote:
>
>     Shane Cruz wrote:
>     > We are running Resin Pro 3.0.25 on RHEL 5.5 and using 64-bit Sun JDK
>     > 1.6.0_05.  Recently, we have started seeing several incidents where
>     > the Resin JVM seems to just randomly get restarted.  There is nothing
>     > in the logs to indicate that the JVM was shut down cleanly or a restart
>     > was attempted; the log files just go from displaying regular log lines
>     > to displaying the following:
>     The logging for 4.0 is much more informative. With 3.0 it's a bit
>     trickier.
>     >
>     > [11:24:18.095] com.caucho.log.EnvironmentLogger.log Server[myserver]
>     > starting
>     >
>     > Things that have already been checked:
>     >
>     > 1. There doesn’t appear to be a JVM crash as no HotSpot Error log
>     > files are created as they usually would be.
>     >
>     > 2. There are no signs in the sudo logs that anyone is manually
>     > restarting the JVM.
>     >
>     > 3. There are no signs in the logs that Resin is restarting itself, even
>     > though we have a “min-free-memory” setting of 1M.  With higher values
>     > of that setting we have seen the JVM get restarted due to low memory,
>     > but I am pretty sure logging always indicated that the JVM was
>     > restarting when this happened before.
>     >
>     > 4. We are not using the Resin “ping” check that might restart the JVM
>     > if it is unresponsive.
>     >
>     > 5. Kernel logging is enabled and it doesn't look like the kernel
>     > is killing it for any reason.
>     >
>     > It almost seems as if the JVM is just getting a kill -9 and then the
>     > wrapper script is starting it back up.  What is the best way to track
>     > down what might be killing the JVM?  We are in the process of testing
>     > an upgrade to a newer version of the JDK, but I am not very confident
>     > that will fix the problem.  I am going to try to turn on full Resin
>     > debug logging, but I thought I would reach out in case anyone else had
>     > an idea of how to track this down.  Is there a way to wrap the Linux
>     > kill command to find out if that is being run?  Any other suggestions
>     > on where to look?
>     Since a phantom kill is pretty unlikely, I wouldn't spend too much time
>     on that theory.
>
>     Since you're not getting a hs_* error, the most likely cause would be
>     either something calling System.exit or Runtime.halt, possibly Resin
>     itself for something like running out of threads or memory (although,
>     as you pointed out, that should be logged.)
>
>     Other than that, the restart should only happen if the config files
>     change (theoretically something like NFS or 'touch' could trigger that,
>     but I assume that's not happening.)
>
>     -- Scott
>     >
>     ------------------------------------------------------------------------
>     >
>     > _______________________________________________
>     > resin-interest mailing list
>     > [email protected] <mailto:[email protected]>
>     > http://maillist.caucho.com/mailman/listinfo/resin-interest
>     >