Thanks.

Not yet sure what the best way to support HA state fixes is. In the first instance, probably just displaying the failed items and offering options to "rebind ignoring failures" and "try rebind again now" (where fixes are made out-of-band, directly in the persistent store [1]).

Best
Alex

[1] CyberDuck is a great tool for working with object stores. It doesn't allow in-place viewing or editing, but it does make bulk transfer easy. Note, however, that for Softlayer there is a bug in the latest v4.5, so use v4.4.5.



On 30/07/2014 09:29, Aled Sage wrote:
+1; all good suggestions.

For "HA state could be edited...", are you thinking that the rebind would pause at the error, allowing the state to be modified and the rebind to continue? Or more that one could look at the task errors, then download and fix the entity state, and then run rebind from the start again?

Aled


On 30/07/2014 05:45, Alex Heneveld wrote:
Hi folks-

As many of you know, if rebind fails when running Brooklyn, the server responds safety-first by failing or declining to start. You then trawl through the logs, investigate the persisted state, resolve the issues, and restart it. Ideally this would be visible and resolvable within the server itself. To this end I'm thinking of:

* a new "maintenance mode" that the server would run in if there are problems in startup or failover
* when in maintenance mode (or even HA standby mode), you are presented with a warning in the GUI but you can set an http session flag to allow access (if you have the entitlement)
* a new "server" tab where server-level tasks are tracked (and other server operations such as shutdown, force failover, etc, could be sensibly re-housed)
* all startup activities and HA activities are run as server-level tasks and visible in the server tab (which would allow you to see the reason for HA failures)
* HA state could be edited in the server tab, when a server is in maintenance mode, to resolve problems and drive rebind
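
To make the proposed behaviour concrete, here is a rough sketch of how the mode transitions in the list above might fit together. It is illustrative only and is not Brooklyn code: the names `ServerMode`, `Server`, `load_items`, `rebind` and `can_access_gui` are all hypothetical, and the persisted state is modelled as a plain dictionary mapping item ids to an error message (or `None` when the item loads cleanly).

```python
from enum import Enum


class ServerMode(Enum):
    # Hypothetical modes; the real server may well use different names.
    MASTER = "master"
    STANDBY = "standby"
    MAINTENANCE = "maintenance"


class Server:
    """Hypothetical server that enters maintenance mode when rebind fails."""

    def __init__(self, load_items):
        # load_items: callable returning {item_id: error message or None}
        self.load_items = load_items
        self.mode = ServerMode.STANDBY
        self.failed_items = {}

    def rebind(self, ignore_failures=False):
        """Re-read persisted state and record any items that failed to load."""
        results = self.load_items()
        self.failed_items = {k: v for k, v in results.items() if v is not None}
        if self.failed_items and not ignore_failures:
            # Stay up, but in maintenance mode, instead of refusing to start.
            self.mode = ServerMode.MAINTENANCE
        else:
            self.mode = ServerMode.MASTER
        return self.mode

    def can_access_gui(self, session_override=False, entitled=False):
        """Outside master mode, access needs the session flag and the entitlement."""
        if self.mode is ServerMode.MASTER:
            return True
        return session_override and entitled
```

Calling `rebind()` again after the persisted state has been repaired corresponds to "try rebind again now", and `rebind(ignore_failures=True)` corresponds to "rebind ignoring failures"; in both cases `failed_items` stays available for display.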

Is this a good evolution?

Best
Alex


