I think I made a mistake. I would need at least N/2 + 1 nodes (a majority) available at all times to reach quorum and be able to do leader election within the ZooKeeper ensemble.
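
To make the quorum arithmetic concrete, here is a rough Python sketch. The function names and the example layouts are made up for illustration and are not part of ZooKeeper or Spark. The last layout assumes a hypothetical third availability zone, which this DC does not have, and is included only for contrast.

```python
def quorum_size(n):
    """Smallest majority of an n-server ensemble: floor(n / 2) + 1."""
    return n // 2 + 1


def survives_any_az_loss(layout):
    """layout holds the number of ZooKeeper servers in each AZ.

    Returns True only if a majority of the full ensemble is still
    up after any single AZ goes down.
    """
    total = sum(layout)
    needed = quorum_size(total)
    return all(total - down >= needed for down in layout)


# Illustrative layouts (servers per AZ); the last one assumes a third AZ.
for layout in [(2, 1), (2, 2), (3, 2), (1, 1, 1)]:
    print(layout, "quorum:", quorum_size(sum(layout)),
          "survives any AZ loss:", survives_any_az_loss(layout))
```

Under these assumptions, none of the two-AZ layouts keeps a majority when the larger (or equal) zone is lost. With 2 + 2 servers the quorum is 3, and only 2 servers remain.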
Given that I don't know ahead of time which availability zone is going to go down, I guess I can't really tolerate one AZ going down with only the two availability zones that are available in my DC.

On Tue, Jan 24, 2017 at 5:37 PM, kant kodali <kanth...@gmail.com> wrote:

> How many Spark masters and ZooKeeper servers do I need to tolerate one
> failure in one DC that has two availability zones? Note: the one failure
> that I want to tolerate can be in either availability zone.
>
> Here is my understanding so far. Please correct me if I am wrong.
>
> For ZooKeeper I would need 2F+1 servers to tolerate F failures, so in my
> case that would be 3. However, if one of the availability zones is down,
> then I would be left with only one ZooKeeper server (assuming the AZ that
> has two ZooKeeper servers goes down). Therefore I would need at least 4
> ZooKeeper servers, two in each availability zone, to tolerate a failure of
> one node and to tolerate a failure of one availability zone.
>
> And the number of Spark masters (standalone mode) would be 2, one in each
> availability zone, such that if one of the Spark masters in one of the
> availability zones or an entire availability zone goes down, the ZooKeeper
> ensemble will be able to elect the Spark master.
>
> Is this correct so far?
>
> Thanks,
> kant