On Tue, Dec 3, 2013 at 12:43 PM, Andy Doan <[email protected]> wrote:
> As we start to talk more concretely about high availability, I'm starting to
> wonder if we should first ask "is it worth it?" ie - what could we expect
> our availability to be if we just deployed a DB and a couple of web-servers.
> If the answer is >98%, then is it worth the man-hours required to get us to
> 99.9%?
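
For a sense of scale, the two availability figures quoted above can be turned into yearly downtime budgets. This is only a back-of-the-envelope sketch: it assumes a 365-day year, and the function name is made up for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year, 8760 hours


def downtime_hours_per_year(availability_pct):
    """Hours of downtime per year allowed at the given availability (in percent)."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR


# Compare the two targets mentioned in the thread.
for pct in (98.0, 99.9):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% available: {hours:.1f} hours of downtime per year")
```

At 98% that is roughly 175 hours (a little over 7 days) per year, against under 9 hours at 99.9%, so the question is whether closing that gap justifies the engineering time.
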
Agreed. Which components truly need HA redundancy, and which can we just
restart when they fail? I'll argue that only the Projects Manager needs to be
redundant. For the others, I was hoping to back up any persistent storage to
Swift, or even rely directly on Swift for persistent data. Are there any
patterns for doing this?

I'm of the opinion that our time is better spent making our APIs capable of
gracefully dealing with failure rather than making sure components don't fail.
So, can we have a common strategy for when the PPA Assigner (for example)
fails? What is monitoring the components for failure and standing them back
up? How do consumers handle timeouts? What status is communicated? ...?

Francis
--
Francis Ginther
Canonical - Ubuntu Engineering - Continuous Integration Team

--
Mailing list: https://launchpad.net/~canonical-ci-engineering
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~canonical-ci-engineering
More help   : https://help.launchpad.net/ListHelp

