Github user sjcorbett commented on a diff in the pull request:
https://github.com/apache/incubator-brooklyn/pull/762#discussion_r35201850
--- Diff: docs/guide/dev/tips/troubleshooting-exceptions.md ---
@@ -0,0 +1,487 @@
+---
+layout: website-normal
+title: Troubleshooting Exceptions and Node Failure
+toc: /guide/toc.json
+---
+
+Whether you're customizing out-of-the-box blueprints, or developing your own custom blueprints, you will
+inevitably have to deal with node failure, or exceptions being thrown by your node. Thankfully Brooklyn
+provides plenty of information to help you locate and resolve any issues you may encounter.
+
+This guide looks at three common failure scenarios and describes the steps that can be taken to
+identify the issue.
+
+## Script failure
+Many blueprints run bash scripts as part of the installation. This section highlights how to identify a problem with
+a bash script.
+
+First let's take a look at the `customize()` method of the Tomcat server blueprint:
+
+{% highlight java %}
+    @Override
+    public void customize() {
+        newScript(CUSTOMIZING)
+            .body.append("mkdir -p conf logs webapps temp")
+            .failOnNonZeroResultCode()
+            .execute();
+
+        copyTemplate(entity.getConfig(TomcatServer.SERVER_XML_RESOURCE), Os.mergePaths(getRunDir(), "conf", "server.xml"));
+        copyTemplate(entity.getConfig(TomcatServer.WEB_XML_RESOURCE), Os.mergePaths(getRunDir(), "conf", "web.xml"));
+
+        // Deduplicate same code in JBoss
+        if (isProtocolEnabled("HTTPS")) {
+            String keystoreUrl = Preconditions.checkNotNull(getSslKeystoreUrl(), "keystore URL must be specified if using HTTPS for " + entity);
+            String destinationSslKeystoreFile = getHttpsSslKeystoreFile();
+            InputStream keystoreStream = resource.getResourceFromUrl(keystoreUrl);
+            getMachine().copyTo(keystoreStream, destinationSslKeystoreFile);
+        }
+
+        getEntity().deployInitialWars();
+    }
+{% endhighlight %}
+
+Here we can see that it's running a script to create four directories before continuing with the customization. Let's
+introduce an error by changing `mkdir` to `mkrid`:
+
+{% highlight java %}
+    newScript(CUSTOMIZING)
+        .body.append("mkrid -p conf logs webapps temp") // `mkdir` changed to `mkrid`
+        .failOnNonZeroResultCode()
+        .execute();
+{% endhighlight %}
+
+Now let's try deploying this using the following YAML:
+
+{% highlight yaml %}
+
+name: Tomcat failure test
+location: localhost
+services:
+- type: brooklyn.entity.webapp.tomcat.TomcatServer
+
+{% endhighlight %}
+
+Shortly after deployment, the entity fails with the following error:
+
+`Failure running task ssh: customizing TomcatServerImpl{id=e1HP2s8x} (HmyPAozV):
+Execution failed, invalid result 127 for customizing TomcatServerImpl{id=e1HP2s8x}`
+
+[](images/script-failure-large.png)
+
+By selecting the `Activities` tab and drilling down into the tasks, we eventually get to the task that failed:
+
+[](images/failed-task-large.png)
+
+By clicking on the `stderr` link, we can see the script failed with the following error:
+
+{% highlight console %}
+/tmp/brooklyn-20150721-132251052-l4b9-customizing_TomcatServerImpl_i.sh: line 10: mkrid: command not found
+{% endhighlight %}
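
Result code 127 is the shell's conventional exit status for "command not found", which is why the misspelled `mkrid` surfaces as `invalid result 127`. The following is a minimal standalone sketch of that behaviour, not Brooklyn code: the class `ExitCodeDemo` and its `run` method are hypothetical names, and it assumes a POSIX `sh` on the path and Java 9 or later (for `Redirect.DISCARD`).

```java
import java.io.IOException;

public class ExitCodeDemo {
    /** Runs the command through sh and returns the shell's exit status. */
    public static int run(String command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder("sh", "-c", command)
                .redirectErrorStream(true)                        // merge stderr into stdout
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)  // drop the output so the child cannot block
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // The misspelled command does not exist, so the shell reports "command not found".
        System.out.println("mkrid exit status: " + run("mkrid -p conf logs webapps temp"));
        // A command that exists and succeeds, for comparison.
        System.out.println("true exit status: " + run("true"));
    }
}
```

A script helper that treats any non-zero status as fatal, as `failOnNonZeroResultCode()` does in the snippet above, would therefore reject the first command and accept the second.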
+
+This tells us *what* went wrong, but doesn't tell us *where*. In order to find that, we'll need to look at the
+stack trace that was logged when the exception was thrown.
+
+It's always worth looking at the Detailed Status section as sometimes this will give you the information you need.
+In this case, the stack trace is limited to the thread that was used to execute the task that ran the script:
+
+{% highlight console %}
+Failed after 40ms
+
+STDERR
+/tmp/brooklyn-20150721-132251052-l4b9-customizing_TomcatServerImpl_i.sh: line 10: mkrid: command not found
+
+
+STDOUT
+Executed /tmp/brooklyn-20150721-132251052-l4b9-customizing_TomcatServerImpl_i.sh, result 127: Execution failed, invalid result 127 for customizing TomcatServerImpl{id=e1HP2s8x}
+
+java.lang.IllegalStateException: Execution failed, invalid result 127 for customizing TomcatServerImpl{id=e1HP2s8x}
+    at brooklyn.entity.basic.lifecycle.ScriptHelper.logWithDetailsAndThrow(ScriptHelper.java:390)
+    at brooklyn.entity.basic.lifecycle.ScriptHelper.executeInternal(ScriptHelper.java:379)
+    at brooklyn.entity.basic.lifecycle.ScriptHelper$8.call(ScriptHelper.java:289)
+    at brooklyn.entity.basic.lifecycle.ScriptHelper$8.call(ScriptHelper.java:287)
+    at brooklyn.util.task.DynamicSequentialTask$DstJob.call(DynamicSequentialTask.java:343)
+    at brooklyn.util.task.BasicExecutionManager$SubmissionCallable.call(BasicExecutionManager.java:469)
+    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
+    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
+    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
+    at java.lang.Thread.run(Thread.java:745)
+{% endhighlight %}
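
The trace above stops at the executor frames because an exception's stack trace records the thread that threw it, not the code that submitted the task. A minimal sketch of that effect follows, using plain `java.util.concurrent` rather than Brooklyn's task framework. `TraceDemo`, `customize`, `runScript` and `hasFrame` are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class TraceDemo {
    /** Stands in for the script task; it always fails. */
    public static String runScript() {
        throw new IllegalStateException("Execution failed, invalid result 127");
    }

    /** Submits the failing task to a worker thread and returns the exception it threw. */
    public static Throwable customize() throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = TraceDemo::runScript;
            executor.submit(task).get();
            return null; // not reached: runScript always throws
        } catch (ExecutionException e) {
            // The cause is the exception object created on the worker thread.
            return e.getCause();
        } finally {
            executor.shutdown();
        }
    }

    /** Reports whether the throwable's own stack trace has a frame for the named method. */
    public static boolean hasFrame(Throwable t, String methodName) {
        return Arrays.stream(t.getStackTrace())
                .anyMatch(frame -> frame.getMethodName().equals(methodName));
    }

    public static void main(String[] args) throws InterruptedException {
        Throwable failure = customize();
        System.out.println("runScript frame present: " + hasFrame(failure, "runScript"));
        System.out.println("customize frame present: " + hasFrame(failure, "customize"));
        failure.printStackTrace();
    }
}
```

In this sketch the submitting method, `customize`, does not appear in the failure's trace, which mirrors why the Detailed Status output shows `ScriptHelper` and executor frames but not the blueprint method that built the script.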
+
+In order to find the exception, we'll need to look in Brooklyn's debug log file. By default, the debug log file
--- End diff --
Irrelevant to this PR: we shouldn't have to look at the log file to get
this.