Hi Spark users,

I'm observing behavior where if a master node goes down for a restart, all
the worker JVMs die (in standalone cluster mode). In other cluster
computing setups with master-worker relationships (namely Hadoop), if a
worker can't connect to the master or its connection drops, it retries a
few times and then falls back to exponential backoff or similar.

In Spark, though, the worker just dies. Is the die-on-disconnect behavior
intentional, or would people be OK with something like 3 retries 5 seconds
apart and then exponential backoff?
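For concreteness, here's a rough sketch of the policy I have in mind. This is hypothetical illustration code, not Spark's actual worker reconnection logic; the `connect_with_retry` name, parameters, and injectable `sleep` hook are all my own inventions:

```python
import time

def connect_with_retry(connect, fixed_retries=3, fixed_delay=5.0,
                       max_delay=60.0, sleep=time.sleep):
    """Hypothetical sketch: retry connect() a few times at a fixed
    interval, then switch to capped exponential backoff instead of
    letting the worker die on disconnect."""
    delay = fixed_delay
    attempt = 0
    while True:
        try:
            return connect()
        except ConnectionError:
            attempt += 1
            sleep(delay)
            # After the initial fixed-interval retries, double the
            # delay on each failure, up to a cap.
            if attempt >= fixed_retries:
                delay = min(delay * 2, max_delay)
```

With the defaults above, the wait sequence on repeated failures would be 5s, 5s, 5s, 10s, 20s, 40s, 60s, 60s, ... rather than a single fatal disconnect.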

Andrew
