Hi all,

I have been developing a custom recovery implementation for Spark masters and workers using Hazelcast clustering.
In the Spark worker code [1], we see that a list of masters has to be provided at worker start-up in order to achieve high availability. This effectively means that the URLs of all possible masters must be known before a worker is spawned. The same seems to apply to the SparkContext (please correct me if I'm wrong).

In our implementation we are planning to add masters dynamically. For example, if the elected leader goes down, we would spawn another master in the cluster, and all the workers connected to the previous master should then connect to the newly spawned one. To do this, we need a way to dynamically update the master list of an already running worker. Can this be achieved with the current Spark implementation?

rgds

[1] https://github.com/apache/spark/blob/v1.4.0/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala#L538

--
Niranda @n1r44 <https://twitter.com/N1R44>
https://pythagoreanscript.wordpress.com/
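P.S. For context, this is the start-up pattern I mean — a minimal sketch of how the static, comma-separated master list is passed in standalone HA mode (hostnames and ports below are placeholders, not from our cluster):

```shell
# Worker start-up: the full list of candidate masters must be known up front.
./sbin/start-slave.sh spark://master1:7077,master2:7077

# The same comma-separated URL form is used when pointing a SparkContext
# (e.g. via spark-shell) at an HA master pair:
MASTER=spark://master1:7077,master2:7077 ./bin/spark-shell
```

Once the worker is registered, I don't see a way to extend that list without restarting it — hence the question.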