Aled, +1 sounds like a sensible plan
Duncan

On Tue, 22 May 2018 at 13:59 Aled Sage <[email protected]> wrote:

> Hi all,
>
> I'd like to change the default value of highAvailabilityMode from
> DISABLED to AUTO.
>
> Currently, if you start two Brooklyn servers pointing at the same
> persisted state (file-system directory or object store's bucket), then
> they are independent (because HA is 'disabled' by default). However,
> they both write to that same persisted state, which will lead to
> surprising behaviour, particularly when a Brooklyn server is next
> restarted.
>
> Changing to 'AUTO' would (almost entirely) have the same behaviour as we
> have currently for a single Brooklyn server. In the case of two servers
> pointing at the same persisted state, the second would come up as
> 'standby', and will be automatically promoted to 'master' if the first
> stops or fails.
>
> I say "almost entirely":
> 1. If you run Brooklyn and then kill it (e.g. `kill -9` or turn off the
> VM), when you start Brooklyn again it will wait to confirm the previous
> server is really dead. It waits for 30 seconds after the server's last
> heartbeat, by default.
> 2. The HA status shows all previous runs of the Brooklyn server (it gets
> a new node-id each time it restarts). This list will get longer and
> longer if you keep restarting Brooklyn, pointing at the same persisted
> state, until you clear out terminated instances from the list (via the
> UI or the REST API).
> 3. The logging at startup will be quite different (e.g. "Brooklyn
> initialisation (part two) complete" now means that the server has
> finished becoming the 'standby'). If anyone has tools/scripts that
> search/parse these logs, then they may be affected.
>
> ---
>
> Note the current behaviour contradicts the docs [1], which say:
> "Brooklyn will automatically run in HA mode if multiple Brooklyn
> instances are started pointing at the same persistence store."
>
> Thoughts?
>
> Aled
>
> p.s. another option would be to try to fail-fast when
> highAvailabilityMode is disabled but there is another Brooklyn using the
> same persisted state. However, distinguishing that from (1) above is
> tricky.
>
> [1]
> https://github.com/apache/brooklyn-docs/blob/master/guide/ops/high-availability/index.md
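
For anyone following the thread, here is a rough sketch of the startup decision described in point (1) and in the two-server case. It is illustrative only and is not Brooklyn's implementation: the function names, the `timeout` parameter and the use of a plain timestamp for the last heartbeat are all assumptions; only the 30-second default comes from the message above.

```python
from datetime import datetime, timedelta
from typing import Optional

# Default wait taken from the thread; everything else here is hypothetical.
HEARTBEAT_TIMEOUT = timedelta(seconds=30)


def seconds_to_wait(now: datetime,
                    last_heartbeat: Optional[datetime],
                    timeout: timedelta = HEARTBEAT_TIMEOUT) -> float:
    """How long a starting node would wait before treating the previous
    master as dead (0.0 if there is no recorded heartbeat or it has expired)."""
    if last_heartbeat is None:
        return 0.0
    remaining = timeout - (now - last_heartbeat)
    return max(0.0, remaining.total_seconds())


def startup_role(now: datetime,
                 last_heartbeat: Optional[datetime],
                 timeout: timedelta = HEARTBEAT_TIMEOUT) -> str:
    """Role a node would take under an AUTO-style mode: 'standby' while
    another node's heartbeat is still fresh, otherwise 'master'."""
    return "standby" if seconds_to_wait(now, last_heartbeat, timeout) > 0 else "master"
```

In this sketch, a server restarted 10 seconds after a `kill -9` would hold off for the remaining 20 seconds, while one restarted 45 seconds later would become master immediately.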
