Hi, I started a Spark standalone cluster with 4 nodes, each running around 8 cores. I want to scale it to a larger number of nodes.
The issue is that I am noticing a lot of worker failures. I believe I am looking for ZooKeeper-based coordination where, if a worker fails, another one is added from a pool of ZooKeeper hosts. I am building from the GitHub master. Can I use the ZooKeeper patch for the standalone cluster? https://github.com/apache/incubator-spark/pull/19

Thanks.
Deb
