Hi, I see this has been asked before but never got a satisfactory answer, so I'll try again.
(Here is the original thread I found: http://mail-archives.apache.org/mod_mbox/spark-user/201403.mbox/%3c1394044078706-2312.p...@n3.nabble.com%3E )

I have a set of workers that keep dying and coming back again. The master prints the following warning: "Got heartbeat from unregistered worker ...."

What is the solution to this? Restarting the master is very undesirable for me, as I have a Shark context sitting on top of it (it's meant to be highly available).

Insights appreciated. I don't think an executor going down is very unexpected, but it does seem odd that it can't rejoin the working set.

I'm running Spark 0.9.1 on CDH.