On Thu, Sep 30, 2010 at 11:35 AM, Alan Cox <[email protected]> wrote:

>> Reliable HA is hard and a lot of work to get right. That's why products
>> like IBM Tivoli Systems Automation for Multiplatform exist.
>
> Reliable full HA is also usually very expensive and it's easy to get
> sucked into the 'and if we add this we get XX benefit' catalogue
> shopping or sales spiel.
>
> Start with a risk assessment not a catalogue. If you don't know the cost
> of a failure, a guess at fail rates or the expected recovery time without
> HA you don't know how much you need to improve and what the justifiable
> spend is.

:-) Very true! We looked at a scenario that cut the 3 minute recovery for
the application by approximately 5 seconds (and the time to detect the
failure was up to 30 seconds). But it would double the resource
consumption (memory, in this case). And we had not even factored in the
software cost for TSA yet...

> Sometimes a little bit of redesign to remove the HA requirement
> from a system can be a lot cheaper than HA, often however with shared
> consistent data it's not easy or not possible.

The 90/10 rule applies. Getting rid of that last 10% of recovery time is
expensive. Going from "pretty quick" recovery to "uninterrupted service"
may take a lot... Planned outages are yet another area.

It also helps to identify the possible failures and their implications.
A lot of the scenarios from discrete servers seem to build on "when
anything happens with X we switch to Y". But when you run both as Linux
guests on z/VM, not only is some of that "anything" less likely to
happen, it also means that in the remaining cases Y may not solve the
problem that affected X. Our brainstorming sessions often ended in "well,
if that happens, our customers would have other things to worry about" ;-)

As long as we are talking about application issues, Y could probably be
started instead of X, without doubling resources as you do with discrete
servers.

Rob
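
To put rough numbers on the risk assessment idea, here is a back-of-the-envelope sketch in Python. Only the 3 minute recovery, the roughly 5 second improvement and the 30 second detection time come from this thread; the failure rate and the cost per minute of outage are made-up placeholders, and the function names are purely illustrative.

```python
def annual_downtime_cost(failures_per_year, detect_s, recover_s, cost_per_minute):
    """Expected yearly outage cost: failure rate x outage duration x cost per minute."""
    outage_minutes = (detect_s + recover_s) / 60.0
    return failures_per_year * outage_minutes * cost_per_minute


def justifiable_spend(failures_per_year, detect_s, recover_s,
                      improved_recover_s, cost_per_minute):
    """Upper bound on yearly HA spend: the downtime cost the improvement avoids."""
    before = annual_downtime_cost(failures_per_year, detect_s, recover_s, cost_per_minute)
    after = annual_downtime_cost(failures_per_year, detect_s, improved_recover_s, cost_per_minute)
    return before - after


# From the thread: up to 30 s to detect, 3 min to recover, about 5 s saved.
# Assumed placeholders: 4 failures per year, 1000 currency units per outage minute.
saving = justifiable_spend(
    failures_per_year=4,
    detect_s=30,
    recover_s=180,
    improved_recover_s=175,
    cost_per_minute=1000.0,
)
print(f"Justifiable yearly spend: {saving:.2f}")
```

With any plausible inputs the avoided cost stays small next to doubling the memory footprint, because detection time dominates the outage and the improvement only touches the recovery part.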
