Hi Ted,

Thanks for the pointers. I now have a three-node ZooKeeper setup. Now, when a ZooKeeper instance goes down, only the master dies; a new master is elected as leader and the cluster stays up. But the master that went down never comes back up.
Is this expected? Is there a way to get an alert when a master is down? How can I make sure that at least one backup master is always up?

Thanks,
Vimal

On Tue, Jun 28, 2016 at 7:24 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> Please see some blog posts w.r.t. the number of nodes in the quorum:
>
> http://stackoverflow.com/questions/13022244/zookeeper-reliability-three-versus-five-nodes
>
> http://www.ibm.com/developerworks/library/bd-zookeeper/
> the paragraph starting with 'A quorum is represented by a strict
> majority of nodes'
>
> FYI
>
> On Tue, Jun 28, 2016 at 5:52 AM, vimal dinakaran <vimal3...@gmail.com>
> wrote:
>
>> I am using ZooKeeper to provide HA for a Spark cluster. We have a
>> two-node ZooKeeper cluster.
>>
>> When one of the ZooKeeper nodes dies, the entire Spark cluster goes down.
>>
>> Is this expected behaviour?
>> Am I missing something in the config?
>>
>> Spark version - 1.6.1
>> ZooKeeper version - 3.4.6
>>
>> // spark-env.sh
>> SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER
>> -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181"
>>
>> Below is the log from the Spark master:
>> ZooKeeperLeaderElectionAgent: We have lost leadership
>> 16/06/27 09:39:30 ERROR Master: Leadership has been revoked -- master
>> shutting down.
>>
>> Thanks
>> Vimal
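
For reference, the quorum arithmetic behind the "strict majority of nodes" rule quoted above can be sketched as follows. This is only a minimal illustration and not part of Spark or ZooKeeper; the function names `quorum_size` and `tolerated_failures` are made up for the example, and it assumes a plain strict-majority quorum.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many nodes can be lost while a strict majority remains."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Compare the two-node setup from the original question with
    # three- and five-node ensembles.
    for n in (2, 3, 5):
        print(f"{n} nodes: quorum={quorum_size(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Under that assumption, a two-node ensemble cannot lose any node and still keep a majority, which would be consistent with the whole cluster going down in the original report, while a three-node ensemble can lose one.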