Further questions: Where were you running this process - on your workstation, or on a Linux server elsewhere? What is the spec/OS of the machine running Brooklyn? How much "stuff" was going on in Brooklyn? (large number of entites or SSH sensors...)
If this was on a Linux server, I'm wondering if the JVM holding Brooklyn used up all of the server's memory and the OOM killer was invoked. If so you should see a message in the output of the "dmesg" command, or in /var/log/syslog or /var/log/messages. Richard. On 29 January 2015 at 13:11, Aled Sage <[email protected]> wrote: > Hi Sam, > > First quick questions: > > * Was brooklyn definitely run with `nohup` or `disown`? > * Are you running with any unusual entities that might inadvertently > have a System.exit or some such?! > * I presume there was no core dump file in the run directory? > > The InterruptedException suggest this might be a relatively gracefully > shutdown. Do you see evidence that the shutdown hook has been called (so the > management context was shut down cleanly)? > > Aled > > > On 29/01/2015 11:07, Sam Corbett wrote: >> >> Hi all, >> >> I'd like help getting to the bottom of the unexpected termination of a >> Brooklyn process. It hit me twice in a row yesterday, once with nothing >> weird in the logs and once with a number of stacktraces indicating an >> InterruptedException was thrown. Both processes were run on the same host >> in Softlayer and were running Clocker. >> >> The first deployment seemed to be working normally. I had deployed a few >> applications to my-docker-cloud and stopped them again. A moment later and >> the process had stopped. The last thing the server did was check the >> status >> of a Weave container: >> >> 2015-01-28 07:16:15,002 DEBUG brooklyn.SSH >> [brooklyn-execmanager-BL31ZSeZ-2001]: check-running >> WeaveContainerImpl{id=r3fpQokV}, on machine >> SshMachineLocation[159.8.36.8:159.8.36.8/159.8.36.8:22@DKRM1V05], >> completed: return status 0 >> 2015-01-28 07:16:15,371 DEBUG b.launcher.BrooklynWebServer >> [shutdownHookThread]: BrooklynWebServer detected shutdown: stopping >> web-console >> >> There were no interesting exceptions in the debug log. >> >> In the second case the process stopped as Brooklyn waited for the status >> of >> a service that did not provision. This time there was a (lot of) >> stacktrace >> in the logs. Most pertinently perhaps was: >> >> 2015-01-28 09:07:49,891 DEBUG b.u.task.BasicExecutionManager >> [brooklyn-execmanager-YRXvc51z-1676]: Exception running task >> Task[post-start:ihocjzth] (rethrowing): java.lang.InterruptedException: >> sleep interrupted >> brooklyn.util.exceptions.RuntimeInterruptedException: >> java.lang.InterruptedException: sleep interrupted >> at brooklyn.util.exceptions.Exceptions.propagate(Exceptions.java:89) >> ~[brooklyn-utils-common-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT] >> at brooklyn.util.time.Time.sleep(Time.java:312) >> ~[brooklyn-utils-common-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT] >> at brooklyn.util.time.Time.sleep(Time.java:318) >> ~[brooklyn-utils-common-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT] >> at brooklyn.util.repeat.Repeater.runKeepingError(Repeater.java:382) >> ~[brooklyn-utils-common-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT] >> at brooklyn.util.repeat.Repeater.run(Repeater.java:305) >> ~[brooklyn-utils-common-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT] >> at brooklyn.entity.basic.Entities.waitForServiceUp(Entities.java:1028) >> ~[brooklyn-core-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT] >> at >> >> brooklyn.entity.basic.SoftwareProcessImpl.waitForServiceUp(SoftwareProcessImpl.java:370) >> ~[brooklyn-software-base-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT] >> at >> >> brooklyn.entity.basic.SoftwareProcessImpl.waitForServiceUp(SoftwareProcessImpl.java:367) >> ~[brooklyn-software-base-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT] >> at >> >> brooklyn.entity.basic.SoftwareProcessDriverLifecycleEffectorTasks.postStartCustom(SoftwareProcessDriverLifecycleEffectorTasks.java:160) >> ~[brooklyn-software-base-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT] >> at >> >> brooklyn.entity.software.MachineLifecycleEffectorTasks$7.run(MachineLifecycleEffectorTasks.java:431) >> ~[brooklyn-software-base-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT] >> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) >> ~[na:1.7.0_65] >> at >> >> brooklyn.util.task.DynamicSequentialTask$DstJob.call(DynamicSequentialTask.java:337) >> ~[brooklyn-core-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT] >> at >> >> brooklyn.util.task.BasicExecutionManager$SubmissionCallable.call(BasicExecutionManager.java:469) >> ~[brooklyn-core-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT] >> at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_65] >> at >> >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) >> [na:1.7.0_65] >> at >> >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) >> [na:1.7.0_65] >> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65] >> Caused by: java.lang.InterruptedException: sleep interrupted >> at java.lang.Thread.sleep(Native Method) [na:1.7.0_65] >> at brooklyn.util.time.Time.sleep(Time.java:310) >> ~[brooklyn-utils-common-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT] >> ... 15 common frames omitted >> >> I've got full logs for each run of the server but I didn't get the exit >> code of either process. I ran a third test on the same host later in the >> day and nothing went wrong (in a reasonable timeframe). >> >> Has anyone experienced this before? Are there any system logs I could have >> looked to for more information? A brief look at the standard /var/log >> files >> revealed nothing. >> >> It was a bit alarming to see that in the first instance the process >> stopped >> with no indication why. >> >> Sam >> >
