+1, sounds sensible to me.

Best.

On Tue, 22 May 2018 at 14:51 Duncan Grant <[email protected]> wrote:

> Aled,
>
> +1 sounds like a sensible plan
>
> Duncan
>
> On Tue, 22 May 2018 at 13:59 Aled Sage <[email protected]> wrote:
>
> > Hi all,
> >
> > I'd like to change the default value of highAvailabilityMode from
> > DISABLED to AUTO.
> >
> > Currently, if you start two Brooklyn servers pointing at the same
> > persisted state (file-system directory or object store's bucket), then
> > they are independent (because HA is 'disabled' by default). However,
> > they both write to that same persisted state, which will lead to
> > surprising behaviour, particularly when a Brooklyn server is next
> > restarted.
> >
> > Changing to 'AUTO' would (almost entirely) have the same behaviour as we
> > have currently for a single Brooklyn server. In the case of two servers
> > pointing at the same persisted state, the second would come up as
> > 'standby', and will be automatically promoted to 'master' if the first
> > stops or fails.
> >
> > I say "almost entirely":
> > 1. If you run Brooklyn and then kill it (e.g. `kill -9` or turn off the
> > VM), when you start Brooklyn again it will wait to confirm the previous
> > server is really dead. It waits for 30 seconds after the server's last
> > heartbeat, by default.
> > 2. The HA status shows all previous runs of the Brooklyn server (it gets
> > a new node-id each time it restarts). This list will get longer and
> > longer if you keep restarting Brooklyn, pointing at the same persisted
> > state, until you clear out terminated instances from the list (via the
> > UI or the REST API).
> > 3. The logging at startup will be quite different (e.g. "Brooklyn
> > initialisation (part two) complete" now means that the server has
> > finished becoming the 'standby'). If anyone has tools/scripts that
> > search/parse these logs, then they may be affected.
> >
> > ---
> >
> > Note the current behaviour contradicts the docs [1], which say:
> > "Brooklyn will automatically run in HA mode if multiple Brooklyn
> > instances are started pointing at the same persistence store."
> >
> > Thoughts?
> >
> > Aled
> >
> > p.s. another option would be to try to fail-fast when
> > highAvailabilityMode is disabled but there is another Brooklyn using the
> > same persisted state. However, distinguishing that from (1) above is
> > tricky.
> >
> > [1]
> > https://github.com/apache/brooklyn-docs/blob/master/guide/ops/high-availability/index.md
> >
>

--
Thomas Bouron • Senior Software Engineer @ Cloudsoft Corporation • https://cloudsoft.io/
Github: https://github.com/tbouron
Twitter: https://twitter.com/eltibouron
