Repository: brooklyn-docs
Updated Branches:
  refs/heads/master 7e8166fa1 -> 30aff82ff


Troubleshooting tips for slow Brooklyn


Project: http://git-wip-us.apache.org/repos/asf/brooklyn-docs/repo
Commit: http://git-wip-us.apache.org/repos/asf/brooklyn-docs/commit/669b2e94
Tree: http://git-wip-us.apache.org/repos/asf/brooklyn-docs/tree/669b2e94
Diff: http://git-wip-us.apache.org/repos/asf/brooklyn-docs/diff/669b2e94

Branch: refs/heads/master
Commit: 669b2e94ee46446de7b1f0947e79e97c7f23d78a
Parents: 74a25d1
Author: Aled Sage <aled.s...@gmail.com>
Authored: Tue May 31 01:04:15 2016 +0100
Committer: Aled Sage <aled.s...@gmail.com>
Committed: Mon Jun 6 23:56:14 2016 +0100

----------------------------------------------------------------------
 .../troubleshooting/detailed-support-report.md  |  43 ++++
 guide/ops/troubleshooting/index.md              |   2 +
 guide/ops/troubleshooting/slow-unresponsive.md  | 237 +++++++++++++++++++
 website/documentation/faq.md                    |   2 +-
 4 files changed, 283 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/brooklyn-docs/blob/669b2e94/guide/ops/troubleshooting/detailed-support-report.md
----------------------------------------------------------------------
diff --git a/guide/ops/troubleshooting/detailed-support-report.md 
b/guide/ops/troubleshooting/detailed-support-report.md
new file mode 100644
index 0000000..6e3c741
--- /dev/null
+++ b/guide/ops/troubleshooting/detailed-support-report.md
@@ -0,0 +1,43 @@
+---
+layout: website-normal
+title: Detailed Support Report
+toc: /guide/toc.json
+---
+
+If you wish to send a detailed report, then depending on the nature of the 
problem, consider 
+collecting the following information.
+
+See the [Brooklyn Slow or Unresponsive](slow-unresponsive.html) docs for details of these commands.
+ 
+{% highlight bash %}
+BROOKLYN_HOME=/home/users/brooklyn/apache-brooklyn-0.9.0-bin
+BROOKLYN_PID=$(cat $BROOKLYN_HOME/pid_java)
+REPORT_DIR=/tmp/brooklyn-report; mkdir -p ${REPORT_DIR}
+DEBUG_LOG=${BROOKLYN_HOME}/brooklyn.debug.log
+
+uname -a > ${REPORT_DIR}/uname.txt
+df -h > ${REPORT_DIR}/df.txt
+cat /proc/cpuinfo > ${REPORT_DIR}/cpuinfo.txt
+cat /proc/meminfo > ${REPORT_DIR}/meminfo.txt
+ulimit -a > ${REPORT_DIR}/ulimit.txt
+cat /proc/${BROOKLYN_PID}/limits >> ${REPORT_DIR}/ulimit.txt
+top -n 1 -b > ${REPORT_DIR}/top.txt
+lsof -p ${BROOKLYN_PID} > ${REPORT_DIR}/lsof.txt
+netstat -an > ${REPORT_DIR}/netstat.txt
+
+jmap -histo:live ${BROOKLYN_PID} > ${REPORT_DIR}/jmap-histo.txt
+jmap -heap ${BROOKLYN_PID} > ${REPORT_DIR}/jmap-heap.txt
+for i in {1..10}; do
+  jstack ${BROOKLYN_PID} > ${REPORT_DIR}/jstack.${i}.txt
+  sleep 1
+done
+grep "brooklyn gc" ${DEBUG_LOG} > ${REPORT_DIR}/brooklyn-gc.txt
+grep "events for subscriber" ${DEBUG_LOG} > ${REPORT_DIR}/events-for-subscriber.txt
+tar czf brooklyn-report.tgz ${REPORT_DIR}
+{% endhighlight %}
+
+Also consider providing your log files and persisted state, though extreme care should be taken if
+these might contain cloud or machine credentials (especially if
+[Externalised Configuration]({{ site.path.guide }}/ops/externalized-configuration.html)
+is not being used for credential storage).
+
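
As a possible follow-up to the report script above, the `brooklyn-gc.txt` output can be condensed to one row per sample, which makes a growing heap or thread count easier to spot. The sketch below is an illustration only: `gc_summary` is a hypothetical helper name, and the pattern assumes the message wording shown in this commit's sample log lines, with sizes reported in MB. Lines that use other units are silently skipped.

```shell
# Hypothetical helper (not part of Brooklyn): condense "brooklyn gc" debug
# lines into rows of "date time phase used_MB total_MB threads".
# Assumes the wording "using X MB / Y MB memory (...); N threads;" as in the
# sample log lines; anything that does not match is dropped.
# Reads the files given as arguments, or stdin when none are given.
gc_summary() {
  sed -n -E 's/^([0-9-]+ [0-9:,]+) .*brooklyn gc \((before|after)\) - using ([0-9.]+) MB \/ ([0-9.]+) MB memory.*; ([0-9]+) threads;.*/\1 \2 \3 \4 \5/p' "$@"
}

# Example usage (path taken from the report script):
#   gc_summary /tmp/brooklyn-report/brooklyn-gc.txt
```

Keeping this as a plain `sed` filter means it can be run on a copy of the report on any machine, without access to the Brooklyn server.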

http://git-wip-us.apache.org/repos/asf/brooklyn-docs/blob/669b2e94/guide/ops/troubleshooting/index.md
----------------------------------------------------------------------
diff --git a/guide/ops/troubleshooting/index.md 
b/guide/ops/troubleshooting/index.md
index ee8dfd7..ebbce45 100644
--- a/guide/ops/troubleshooting/index.md
+++ b/guide/ops/troubleshooting/index.md
@@ -5,6 +5,8 @@ children:
 - { path: overview.md, title: Overview }
 - { path: deployment.md, title: Deployment }
 - { path: connectivity.md, title: Server Connectivity }
+- { path: slow-unresponsive.md, title: Brooklyn Slow or Unresponsive }
+- { path: detailed-support-report.md, title: Detailed Support Report }
 - { path: softwareprocess.md, title: SoftwareProcess Entities }
 - { path: going-deep-in-java-and-logs.md, title: Going Deep in Java and Logs }
 ---

http://git-wip-us.apache.org/repos/asf/brooklyn-docs/blob/669b2e94/guide/ops/troubleshooting/slow-unresponsive.md
----------------------------------------------------------------------
diff --git a/guide/ops/troubleshooting/slow-unresponsive.md 
b/guide/ops/troubleshooting/slow-unresponsive.md
new file mode 100644
index 0000000..0b90e83
--- /dev/null
+++ b/guide/ops/troubleshooting/slow-unresponsive.md
@@ -0,0 +1,237 @@
+---
+layout: website-normal
+title: Brooklyn Slow or Unresponsive
+toc: /guide/toc.json
+---
+
+There are many possible causes for a Brooklyn server becoming slow or 
unresponsive. This guide 
+describes some possible reasons, and some commands and tools that can help 
diagnose the problem.
+
+Possible reasons include:
+* CPU is max'ed out
+* Memory usage is extremely high
+* SSH'ing is very slow (e.g. due to lack of entropy)
+* Out of disk space
+
+See [Brooklyn Requirements]({{ site.path.guide }}/ops/requirements.html) for 
details of server 
+requirements.
+
+
+## Machine Diagnostics
+
+The following commands will collect OS-level diagnostics about the machine, and about the Brooklyn
+process. The commands below assume use of CentOS 6.x. Minor adjustments may be required for
+other platforms.
+
+
+#### OS and Machine Details
+
+To display system information, run:
+
+{% highlight bash %}
+uname -a
+{% endhighlight %}
+
+To show details of the CPU and memory available to the machine, run:
+
+{% highlight bash %}
+cat /proc/cpuinfo
+cat /proc/meminfo
+{% endhighlight %}
+
+
+#### User Limits
+
+To display information about user limits, run the command below (while logged 
in as the same user
+who runs Brooklyn):
+
+{% highlight bash %}
+ulimit -a
+{% endhighlight %}
+
+If Brooklyn is run as a different user (e.g. with user name "adalovelace"), 
then instead run:
+
+{% highlight bash %}
+sudo -u adalovelace bash -c 'ulimit -a'
+{% endhighlight %}
+
+Of particular interest is the limit for "open files".
+
+
+#### Disk Space
+
+The command below will list the disk size for each partition, including the amount used and
+available. If the Brooklyn base directory, persistence directory or logging directory is close
+to 0% available, this can cause serious problems:
+
+{% highlight bash %}
+df -h
+{% endhighlight %}
+
+
+#### CPU and Memory Usage
+
+To view the CPU and memory usage of all processes, and of the machine as a 
whole, one can use the 
+`top` command. This runs interactively, updating every few seconds. To collect 
the output once 
+(e.g. to share diagnostic information in a bug report), run:
+ 
+{% highlight bash %}
+top -n 1 -b > top.txt
+{% endhighlight %}
+
+
+#### File and Network Usage
+
+To count the number of open files for the Brooklyn process (which includes 
open socket connections):
+
+{% highlight bash %}
+BROOKLYN_HOME=/home/users/brooklyn/apache-brooklyn-0.9.0-bin
+BROOKLYN_PID=$(cat $BROOKLYN_HOME/pid_java)
+lsof -p $BROOKLYN_PID | wc -l
+{% endhighlight %}
+
+To count the number of "established" internet connections, run:
+
+{% highlight bash %}
+netstat -an | grep ESTABLISHED | wc -l
+{% endhighlight %}
+
+
+#### Linux Kernel Entropy
+
+A lack of entropy can cause random number generation to be extremely slow. 
This can cause
+tasks like ssh to also be extremely slow. See 
+[linux kernel entropy]({{ site.path.website 
}}/documentation/increase-entropy.html)
+for details of how to work around this.
+
+
+## Process Diagnostics
+
+#### Thread and Memory Usage
+
+To get memory and thread usage for the Brooklyn (Java) process, two useful 
tools are `jstack` 
+and `jmap`. These require the "development kit" to also be installed 
+(e.g. `yum install java-1.7.0-openjdk-devel`). Some useful commands are shown 
below:
+
+{% highlight bash %}
+BROOKLYN_HOME=/home/users/brooklyn/apache-brooklyn-0.9.0-bin
+BROOKLYN_PID=$(cat $BROOKLYN_HOME/pid_java)
+
+jstack $BROOKLYN_PID
+jmap -histo:live $BROOKLYN_PID
+jmap -heap $BROOKLYN_PID
+{% endhighlight %}
+ 
+
+#### Runnable Threads
+
+The 
[jstack-active](https://github.com/apache/brooklyn-dist/blob/master/scripts/jstack-active.sh)
+script is a convenient light-weight way to quickly see which threads of a 
running Brooklyn
+server are attempting to consume the CPU. It filters the output of `jstack`, 
to show only the
+"really-runnable" threads (as opposed to those that are blocked).
+
+{% highlight bash %}
+BROOKLYN_HOME=/home/users/brooklyn/apache-brooklyn-0.9.0-bin
+BROOKLYN_PID=$(cat $BROOKLYN_HOME/pid_java)
+
+curl -O https://raw.githubusercontent.com/apache/brooklyn-dist/master/scripts/jstack-active.sh
+
+bash jstack-active.sh $BROOKLYN_PID
+{% endhighlight %}
+
+
+#### Profiling
+
+If an in-depth investigation of the CPU usage (and/or object creation) of a Brooklyn Server is
+required, there are many profiling tools designed specifically for this purpose. These generally
+require that the process be launched in such a way that a profiler can attach, which may not be
+appropriate for a production server.
+
+
+#### Remote Debugging
+
+If the Brooklyn Server was originally run to allow a remote debugger to connect (strongly
+discouraged in production!), then this provides a convenient way to investigate why Brooklyn
+is being slow or unresponsive. See the
+[Debugging Remote Brooklyn]({{ site.path.guide }}/dev/tips/debugging-remote-brooklyn.html) tip
+and the [IDE docs]({{ site.path.guide }}/dev/env/ide/) for more
+information.
+
+
+## Log Files
+
+Apache Brooklyn will by default create brooklyn.info.log and brooklyn.debug.log files. See the
+[Logging]({{ site.path.guide }}/ops/logging.html) docs for more information.
+
+The following are useful log messages to search for (e.g. using `grep`). Note that the wording of
+these messages (or their very presence) may change in future versions of Brooklyn.
+
+
+#### Normal Logging
+
+The lines below are commonly logged, and can be useful to search for when 
finding the start of a section of logging.
+
+{% highlight %}
+2016-05-30 17:05:51,458 INFO  o.a.b.l.BrooklynWebServer [main]: Started 
Brooklyn console at http://127.0.0.1:8081/, running classpath://brooklyn.war
+2016-05-30 17:06:04,098 INFO  o.a.b.c.m.h.HighAvailabilityManagerImpl [main]: 
Management node tF3GPvQ5 running as HA MASTER autodetected
+2016-05-30 17:06:08,982 INFO  o.a.b.c.m.r.InitialFullRebindIteration 
[brooklyn-execmanager-rvpnFTeL-0]: Rebinding from 
/home/compose/compose-amp-state/brooklyn-persisted-state/data for master 
rvpnFTeL...
+2016-05-30 17:06:11,105 INFO  o.a.b.c.m.r.RebindIteration 
[brooklyn-execmanager-rvpnFTeL-0]: Rebind complete (MASTER) in 2s: 19 apps, 54 
entities, 50 locations, 46 policies, 704 enrichers, 0 feeds, 160 catalog items
+{% endhighlight %}
+
+
+#### Memory Usage
+
+The debug log includes (every minute) a log statement about the memory usage 
and task activity. For example:
+
+{% highlight %}
+2016-05-27 12:20:19,395 DEBUG o.a.b.c.m.i.BrooklynGarbageCollector 
[brooklyn-gc]: brooklyn gc (before) - using 328 MB / 496 MB memory (5.58 kB 
soft); 69 threads; storage: {datagrid={size=7, createCount=7}, refsMapSize=0, 
listsMapSize=0}; tasks: 10 active, 33 unfinished; 78 remembered, 1696906 total 
submitted)
+2016-05-27 12:20:19,395 DEBUG o.a.b.c.m.i.BrooklynGarbageCollector 
[brooklyn-gc]: brooklyn gc (after) - using 328 MB / 496 MB memory (5.58 kB 
soft); 69 threads; storage: {datagrid={size=7, createCount=7}, refsMapSize=0, 
listsMapSize=0}; tasks: 10 active, 33 unfinished; 78 remembered, 1696906 total 
submitted)
+{% endhighlight %}
+
+These can be extremely useful if investigating a memory or thread leak, or to 
determine whether a 
+surprisingly high number of tasks are being executed.
+
+
+#### Subscriptions
+
+One source of high CPU in Brooklyn is when a subscription (e.g. for a policy 
or enricher) is being 
+triggered many times (i.e. handling many events). A log message like that 
below will be logged on 
+every 1000 events handled by a given single subscription.
+
+{% highlight %}
+2016-05-30 17:29:09,125 DEBUG o.a.b.c.m.i.LocalSubscriptionManager 
[brooklyn-execmanager-rvpnFTeL-8]: 1000 events for subscriber 
Subscription[SCFnav9g;CanopyComposeApp{id=gIeTwhU2}@gIeTwhU2:webapp.url]
+{% endhighlight %}
+
+If a subscription is handling a huge number of events, there are a couple of common reasons:
+* first, it could be subscribing to too much activity - e.g. a wildcard subscription for all
+  events from all entities.
+* second, it could be an infinite loop (e.g. where an enricher responds to a sensor-changed event
+  by setting that same sensor, thus triggering another sensor-changed event).
+
+
+#### User Activity
+
+All activity triggered by the REST API or web-console will be logged. Some 
examples are shown below:
+
+{% highlight %}
+2016-05-19 17:52:30,150 INFO  o.a.b.r.r.ApplicationResource 
[brooklyn-jetty-server-8081-qtp1058726153-17473]: Launched from YAML: name: My 
Example App
+location: aws-ec2:us-east-1
+services:
+- type: org.apache.brooklyn.entity.webapp.tomcat.TomcatServer
+
+2016-05-30 14:46:19,516 DEBUG brooklyn.REST 
[brooklyn-jetty-server-8081-qtp1104967201-20881]: Request Tisj14 starting: POST 
/v1/applications/NiBy0v8Q/entities/NiBy0v8Q/expunge from 77.70.102.66
+{% endhighlight %}
+
+
+#### Entity Activity
+
+If investigating the behaviour of a particular entity (e.g. on failure), it 
can be very useful to 
+`grep` the info and debug log for the entity's id. For a software process, the 
debug log will 
+include the stdout and stderr of all the commands executed by that entity.
+
+It can also be very useful to search for all effector invocations, to see 
where the behaviour
+has been triggered:
+
+{% highlight %}
+2016-05-27 12:45:43,529 DEBUG o.a.b.c.m.i.EffectorUtils 
[brooklyn-execmanager-gvP7MuZF-14364]: Invoking effector stop on 
TomcatServerImpl{id=mPujYmPd}
+{% endhighlight %}
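
To make the Subscriptions advice above more concrete, one way to find the noisiest subscriptions is to tally the "events for subscriber" lines per subscriber. This is a minimal sketch, not something shipped with Brooklyn: `busiest_subscribers` is a hypothetical helper name, and it relies only on the statement above that one such line is logged per 1000 events, so each counted line stands for roughly 1000 events.

```shell
# Hypothetical helper (not part of Brooklyn): count "events for subscriber"
# debug lines per subscriber, busiest first. Each counted line represents
# roughly 1000 handled events, per the Subscriptions section.
# Reads the log files given as arguments, or stdin when none are given.
busiest_subscribers() {
  grep -h 'events for subscriber' "$@" \
    | sed -E 's/.*events for subscriber //' \
    | sort | uniq -c | sort -rn
}

# Example usage (log file name as used elsewhere in these docs):
#   busiest_subscribers brooklyn.debug.log | head
```

A subscriber that dominates this list is a reasonable first candidate when looking for a wildcard subscription or a sensor feedback loop.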

http://git-wip-us.apache.org/repos/asf/brooklyn-docs/blob/669b2e94/website/documentation/faq.md
----------------------------------------------------------------------
diff --git a/website/documentation/faq.md b/website/documentation/faq.md
index 7af5f80..483d686 100644
--- a/website/documentation/faq.md
+++ b/website/documentation/faq.md
@@ -31,7 +31,7 @@ You could encounter this error when running with many 
entities.
 Please **increase the ulimit** if you see such error:
 
 On the VM running Apache Brooklyn, we recommend ensuring nproc and nofile are 
reasonably high (e.g. higher than 1024, which is often the default).
-We recommend setting it limits to a value above 16000.
+We recommend setting these limits to a value of 16384 or higher.
 
 If you want to check the current limits run `ulimit -a`.
 
