GitHub user clebertsuconic commented on a diff in the pull request:
--- Diff: docs/user-manual/en/critical-analysis.md ---
@@ -0,0 +1,32 @@
+# Critical Analysis of the broker
+There are a few things that can go wrong in a production environment:
+- Bugs: however hard we try, they still happen. We always try to correct
them, but they are the one constant in software development.
+- IO errors: disks and hardware can go bad
+- Memory issues, or the CPU being starved by another process
+For cases like these, we added a safeguard to the broker so that it shuts
itself down when bad things happen.
+We measure response time in places like:
+- Queue delivery (add to the queue)
+- Journal storage
+- Paging operations
+If the response time goes beyond a configured timeout, the broker is
considered unstable and an action will be taken to either shut down the broker
or halt the VM.
+You can use the following configuration options in broker.xml to
configure how the critical analysis is performed.
+Name | Description
+:--- | :---
+analyze-critical | Enable or disable the critical analysis (default true)
+analyze-critical-timeout | Timeout used to do the critical analysis
(default 120000 milliseconds)
+analyze-critical-check-period | Time used to check the response times
(default half of analyze-critical-timeout)
+analyze-critical-halt | Should the VM be halted upon failures (default
--- End diff --
The idea here was to add monitoring on internal objects similar to what we
do on pings / pongs through the protocols. I didn't want to make it any more
complex than needed.
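
To make the idea concrete, here is a minimal sketch of that kind of
monitoring. It is not the code from the pull request: the class name
`CriticalAnalyzerSketch`, the `Component` type and the `enter` / `leave` /
`check` methods are all hypothetical names chosen for illustration. Each
monitored path records when it enters a critical section, and a periodic
check reports any path that has stayed inside for longer than the timeout.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical sketch: these class and method names are illustrative only
// and are not taken from the broker's code base.
public class CriticalAnalyzerSketch {

    /** One monitored code path, such as queue delivery or journal storage. */
    public static final class Component {
        private static final long IDLE = -1L; // marker for "not in a critical section"
        private final String name;
        private final LongSupplier clock;
        private final AtomicLong enteredAt = new AtomicLong(IDLE);

        private Component(String name, LongSupplier clock) {
            this.name = name;
            this.clock = clock;
        }

        /** Call just before the monitored operation starts. */
        public void enter() {
            enteredAt.set(clock.getAsLong());
        }

        /** Call as soon as the monitored operation returns. */
        public void leave() {
            enteredAt.set(IDLE);
        }

        /** True if the operation has been running for longer than the timeout. */
        public boolean isExpired(long timeoutMillis) {
            long entered = enteredAt.get();
            return entered != IDLE && clock.getAsLong() - entered > timeoutMillis;
        }
    }

    private final List<Component> components = new CopyOnWriteArrayList<>();
    private final long timeoutMillis;
    private final LongSupplier clock;

    public CriticalAnalyzerSketch(long timeoutMillis, LongSupplier clock) {
        this.timeoutMillis = timeoutMillis;
        this.clock = clock;
    }

    /** Registers a new monitored path and returns its handle. */
    public Component register(String name) {
        Component component = new Component(name, clock);
        components.add(component);
        return component;
    }

    /** Returns the names of all components that exceeded the timeout. */
    public List<String> check() {
        List<String> expired = new ArrayList<>();
        for (Component component : components) {
            if (component.isExpired(timeoutMillis)) {
                expired.add(component.name);
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        // 120000 ms mirrors the documented default timeout.
        CriticalAnalyzerSketch analyzer =
            new CriticalAnalyzerSketch(120_000, System::currentTimeMillis);
        Component journal = analyzer.register("journal");
        journal.enter();
        // ... the journal write would happen here ...
        journal.leave();
        System.out.println("expired: " + analyzer.check());
    }
}
```

In a sketch like this, `check()` would be called on a timer at the configured
check period, and a non-empty result is where the broker would be shut down
or the VM halted. The clock is passed in so the timeout logic can be tested
without waiting in real time.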
This subject could be raised to become an independent project if you keep