Github user aledsage commented on a diff in the pull request:
https://github.com/apache/brooklyn-docs/pull/78#discussion_r68122663
--- Diff: guide/ops/high-availability/high-availability-supplemental.md ---
@@ -0,0 +1,142 @@
+---
+title: High Availability (Supplemental)
+layout: website-normal
+---
+
+This document supplements the High Availability documentation available
[here](http://brooklyn.apache.org/v/latest/ops/high-availability.html)
+and provides an example of how to configure a pair of Apache Brooklyn
servers to run in master-standby mode with a shared NFS datastore.
+
+### Prerequisites
+- Two VMs (or physical machines) have been provisioned
+- NFS or another suitable file system has been configured and is available
to both VMs*
+- An NFS folder has been mounted on both VMs at
`/mnt/brooklyn-persistence` and both machines can write to the folder
+
+\* Brooklyn can be configured to use either an object store such as S3, or
a shared NFS mount. The recommended option is to use an object
+store as described in the [Object Store
Persistence](./persistence/#object-store-persistence) documentation. For
clarity, a shared NFS folder
+is assumed in this example.
+
+### Launching
+To start, download and install the latest Apache Brooklyn release on both
VMs following the 'OSX / Linux' section
+of the [Running Apache
Brooklyn](../start/running.html#install-apache-brooklyn) documentation.
+
+On the first VM, which will be the master node, run the following to start
Brooklyn in high availability mode:
+
+{% highlight bash %}
+$ bin/brooklyn launch --highAvailability master --persist auto
--persistenceDir /mnt/brooklyn-persistence
+{% endhighlight %}
+
+Once Brooklyn has launched, on the second VM, run the following command to
launch Brooklyn in standby mode:
+
+{% highlight bash %}
+$ bin/brooklyn launch --highAvailability auto --persist auto
--persistenceDir /mnt/brooklyn-persistence
+{% endhighlight %}
+
+### Testing
+You can now confirm that Brooklyn is running in high availability mode on
the master by logging into the web console at `http://<ip-address>:8081`.
+Similarly you can log into the web console on the standby VM where you
will see a warning that the server is not the high availability master.
+To test a failover, you can simply terminate the process on the first VM
and log into the web console on the second VM. Upon launch, Brooklyn will
+output its PID to the file `pid.txt`; you can terminate the process by
running the following command from the same directory from which you
+launched Brooklyn:
+
+{% highlight bash %}
+$ kill -9 $(cat pid.txt)
+{% endhighlight %}
+
+It is also possible to check the high availability state of a running
Brooklyn server using the following curl command:
+
+{% highlight bash %}
+$ curl -u myusername:mypassword http://<ip-address>:8081/v1/server/ha/state
+{% endhighlight %}
+
+This will return one of the following states:
+
+{% highlight bash %}
+
+"INITIALIZING"
+"STANDBY"
+"HOT_STANDBY"
+"HOT_BACKUP"
+"MASTER"
+"FAILED"
+"TERMINATED"
+
+{% endhighlight %}
+
+Note: The quotation characters will be included in the reply.
+
+To obtain information about all of the nodes in the cluster, run the
following command against any of the nodes in the cluster:
+
+{% highlight bash %}
+$ curl -u myusername:mypassword
http://<ip-address>:8081/v1/server/ha/states
+{% endhighlight %}
+
+This will return a JSON document describing the Brooklyn nodes in the
cluster. An example of two HA Brooklyn nodes is as follows (whitespace
formatting has been
+added for clarity):
+
+{% highlight yaml %}
+
+{
+ ownId: "XkJeXUXE",
+ masterId: "yAVz0fzo",
+ nodes: {
+ yAVz0fzo: {
+ nodeId: "yAVz0fzo",
+ nodeUri: "http://<server1-ip-address>:8081/",
+ status: "MASTER",
+ localTimestamp: 1466414301065,
+ remoteTimestamp: 1466414301000
+ },
+ XkJeXUXE: {
+ nodeId: "XkJeXUXE",
+ nodeUri: "http://<server2-ip-address>:8081/",
+ status: "STANDBY",
+ localTimestamp: 1466414301066,
+ remoteTimestamp: 1466414301000
+ }
+ },
+ links: { }
+}
+
+{% endhighlight %}
+
+The examples above show how to use `curl` to manually check the status of
Brooklyn via its REST API. The same REST API calls can also be used by
+automated third-party monitoring tools such as Monit.
+
+### Failover
+When running as an HA standby node, each standby Brooklyn server (in this
case there is only one standby) will check the shared persisted state
+every second to determine the state of the HA master. If no heartbeat
has been recorded for thirty seconds, then an election will be performed
+and one of the standby nodes will be promoted to master. At this point all
requests should be directed to the new master node.
+
+In the event that tasks - such as the provisioning of a new entity - are
running when a failover occurs, the new master will display the current
+state of the entity, but will not resume its provisioning or re-run any
partially completed tasks. In this case it will usually be necessary
+to remove the node and reprovision it.
+
+### Client Configuration
+It is the responsibility of the client to connect to the master Brooklyn
server. This can be accomplished in a variety of ways:
+
+* **Reverse Proxy**
+
+ To allow the client application to automatically fail over in the event
of a master server becoming unavailable, or the promotion of a new master,
+ a reverse proxy can be configured to route traffic depending on the
response returned by `http://<ip-address>:8081/v1/server/ha/state` (see above).
+ If a server returns `"MASTER"`, then traffic should be routed to that
server, otherwise it should not be. The client software should be configured
--- End diff --
Add that it could take 30 seconds for the standby to be promoted, so the
reverse proxy should retry for at least this period. Alternatively, say that
the failover time should be reconfigured to be shorter.
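
For illustration only, the retry behaviour could look roughly like the sketch
below. It is not taken from the Brooklyn docs: the function names `ha_state`
and `find_master` and the `BROOKLYN_AUTH` variable are hypothetical, and only
the `/v1/server/ha/state` endpoint, port 8081 and the quoted `"MASTER"` reply
come from the diff above. It polls each candidate node about once per second
until one reports master or the retry window runs out.

```shell
#!/bin/sh
# Hypothetical helpers; names and defaults are assumptions, not Brooklyn API.

# Print the HA state of one node, e.g. "MASTER" (quotes are part of the reply).
# -s silences progress output, -m 2 caps each request at two seconds.
ha_state() {
    curl -s -m 2 -u "${BROOKLYN_AUTH:-myusername:mypassword}" \
        "http://$1:8081/v1/server/ha/state"
}

# find_master RETRIES NODE...
# Prints the first node that reports "MASTER" and returns 0.
# Returns 1 if no node does so after RETRIES further one-second waits.
# Time spent inside curl is not counted, so the real window can be longer.
find_master() {
    retries=$1
    shift
    attempt=0
    while [ "$attempt" -le "$retries" ]; do
        for node in "$@"; do
            if [ "$(ha_state "$node")" = '"MASTER"' ]; then
                echo "$node"
                return 0
            fi
        done
        sleep 1
        attempt=$((attempt + 1))
    done
    return 1
}
```

For example, `find_master 35 <server1-ip-address> <server2-ip-address>` would
keep trying slightly past the 30-second promotion window. A reverse proxy's
health check would need an equivalent retry setting.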