[daemon] Java Service stop acting differently on different machines

Andrew Miller Sun, 18 Aug 2013 16:55:30 -0700

I have written a Service Wrapper class that uses the Procrun 1.0.15
binaries on Windows in JVM mode, as part of a larger project that has
Service components.  On most machines where we have deployed the wrapper it
behaves normally.  However, on some machines, when a user directs the
service to stop via the Windows Control Panel, an error dialog pops up
saying that the service stopped unexpectedly (Error code 1067).  Because we
also configure the service to restart on errors, the service is then
restarted (which is great when real unexpected problems occur, but not when
a user wants to stop the service intentionally, which in turn leads to
other undesired side effects).
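
For readers unfamiliar with Procrun's JVM mode, the arrangement described
above can be sketched roughly as follows.  This is a minimal illustration
and not the actual wrapper from this project: the class name
ServiceWrapperSketch, the latch, and the start method are assumptions (the
logs below only show that a static stop method is invoked).  It relies on
the documented JVM-mode behaviour that start and stop are static methods
taking a String[] and that start should not return until stop has been
called.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for a Procrun JVM-mode wrapper class.
final class ServiceWrapperSketch {
    // Released by stop() so that start() can return.
    private static final CountDownLatch stopRequested = new CountDownLatch(1);

    // Called by Procrun on its start worker thread (method name is an assumption).
    static void start(String[] args) throws InterruptedException {
        // Application components would be started here.
        stopRequested.await();   // block until stop() is called
        // Application components would be shut down here.
    }

    // Called by Procrun on a separate stop worker thread.
    static void stop(String[] args) {
        stopRequested.countDown();
    }
}
```

With a wrapper of this shape, stop returns almost immediately, and the
service is only fully stopped once start has returned on its own thread.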


In general there is no major difference between the machines that run the
Service well and those exhibiting the error.  All machines are Windows 7
64-bit, and the service is installed and run using the same amd64 procrun
binaries.

On a machine where the shutdown happens cleanly we see this trace:
[2013-08-14 12:55:52] [debug] ( prunsrv.c:844 ) [ 2300] reportServiceStatusE: 3, 0, 3000, 0
[2013-08-14 12:55:52] [info]  ( prunsrv.c:943 ) [ 3744] Stopping service...
[2013-08-14 12:55:52] [debug] ( javajni.c:888 ) [ 7620] argv[0] = com.xx_xx.abcd.system.capture.efgh.MasterService
[2013-08-14 12:55:52] [debug] ( javajni.c:941 ) [ 7620] Java Worker thread started com/xx_xx/abcd/system/common/ServiceWrapper:stop
[2013-08-14 12:55:52] [debug] ( javajni.c:964 ) [ 7620] Java Worker thread finished com/xx_xx/abcd/system/common/ServiceWrapper:stop with status=0
[2013-08-14 12:55:53] [debug] ( prunsrv.c:990 ) [ 3744] Waiting for java jni stop worker to finish...
[2013-08-14 12:55:53] [debug] ( prunsrv.c:992 ) [ 3744] Java jni stop worker finished.
[2013-08-14 12:55:53] [debug] ( prunsrv.c:844 ) [ 3744] reportServiceStatusE: 3, 0, 300000, 0
[2013-08-14 12:55:53] [debug] ( prunsrv.c:1093) [ 3744] Waiting for worker to die naturally...
[2013-08-14 12:55:53] [debug] ( prunsrv.c:1104) [ 3744] Worker finished gracefully in 0 ms.
[2013-08-14 12:55:53] [info]  ( prunsrv.c:1114) [ 3744] Service stop thread completed.
[2013-08-14 12:55:55] [debug] ( javajni.c:471 ) [ 7452] Exit hook with exit code 1
[2013-08-14 12:55:55] [debug] ( prunsrv.c:910 ) [ 7452] Stop exit hook called ...
[2013-08-14 12:55:55] [debug] ( prunsrv.c:844 ) [ 7452] reportServiceStatusE: 1, 0, 0, 0
[2013-08-14 12:55:55] [debug] ( prunsrv.c:919 ) [ 7452] Start exit hook called ...
[2013-08-14 12:55:55] [debug] ( prunsrv.c:920 ) [ 7452] VM exit code: 1

On the machines where shutdown fails with the 1067 error code we see this
trace:
[2013-08-15 15:22:49] [debug] ( prunsrv.c:844 ) [ 6504] reportServiceStatusE: 3, 0, 3000, 0
[2013-08-15 15:22:49] [info]  ( prunsrv.c:943 ) [ 6644] Stopping service...
[2013-08-15 15:22:49] [debug] ( javajni.c:888 ) [ 7396] argv[0] = com.xx_xx.abcd.system.capture.efgh.MasterService
[2013-08-15 15:22:49] [debug] ( javajni.c:941 ) [ 7396] Java Worker thread started com/xx_xx/abcd/system/common/ServiceWrapper:stop
[2013-08-15 15:22:49] [debug] ( javajni.c:964 ) [ 7396] Java Worker thread finished com/xx_xx/abcd/system/common/ServiceWrapper:stop with status=0
[2013-08-15 15:22:50] [debug] ( prunsrv.c:990 ) [ 6644] Waiting for java jni stop worker to finish...
[2013-08-15 15:22:50] [debug] ( prunsrv.c:992 ) [ 6644] Java jni stop worker finished.
[2013-08-15 15:22:50] [debug] ( prunsrv.c:844 ) [ 6644] reportServiceStatusE: 3, 0, 300000, 0
[2013-08-15 15:22:50] [debug] ( prunsrv.c:1093) [ 6644] Waiting for worker to die naturally...
[2013-08-15 15:22:50] [debug] ( prunsrv.c:1104) [ 6644] Worker finished gracefully in 0 ms.
[2013-08-15 15:22:50] [info]  ( prunsrv.c:1114) [ 6644] Service stop thread completed.
[2013-08-15 15:22:52] [debug] ( javajni.c:471 ) [ 6948] Exit hook with exit code 1

On the machine where the shutdown fails, the [debug] lines that lead up to
the "VM exit code" entry (prunsrv.c:910 through prunsrv.c:920) are clearly
missing.  It seems like the JVM is crashing, which leads Windows to detect
the Service as having stopped unexpectedly, but whatever is causing the
crash isn't leaving any trace that we can find, either in the
commons-daemon log or in the logs generated by our own application code.
Does anyone have any thoughts as to what might be happening?  Are there any
other JVM debug settings that can be turned on, or anything else that might
be helpful?
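
The comparison between the two traces can also be done mechanically.  The
following is a minimal sketch, assuming each log entry sits on a single
line in the format shown above; the TraceDiff class and its method names
are hypothetical helpers, not part of commons-daemon.  It extracts the
"( file.c:NNN )" source location from each entry and reports the locations
that appear in the good trace but never in the failing one.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for comparing two Procrun debug traces.
final class TraceDiff {
    // Matches the source location, e.g. "( prunsrv.c:844 )" or "( prunsrv.c:1093)".
    private static final Pattern LOCATION =
            Pattern.compile("\\(\\s*([\\w.]+):(\\d+)\\s*\\)");

    // Collects "file:line" locations in order of first appearance.
    static Set<String> locations(List<String> lines) {
        Set<String> found = new LinkedHashSet<>();
        for (String line : lines) {
            Matcher m = LOCATION.matcher(line);
            if (m.find()) {
                found.add(m.group(1) + ":" + m.group(2));
            }
        }
        return found;
    }

    // Locations logged in the good trace that never show up in the bad one.
    static Set<String> missingFrom(List<String> good, List<String> bad) {
        Set<String> missing = locations(good);
        missing.removeAll(locations(bad));
        return missing;
    }
}
```

Comparing by source location rather than by full text avoids false
differences caused by timestamps and thread ids, which naturally differ
between machines.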
