On Fri, Jun 19, 2015 at 3:46 PM, Vinod Kone <[email protected]> wrote:
>
> > *If* the 75 seconds is exceeded but we're within the recovery_timeout,
> > the slave *should* register with a new slave ID. The slave daemon (with
> > the new slave ID) reconnects to the old executors and updates them to use
> > the new slave ID.
> >
>
> This is not true. 'recovery_timeout' was added to make sure that if a
> slave is down for a long time (>10 mins), the executors commit suicide. It
> is better for the executor/task to die than keep running because the
> framework might have already launched another replica of that instance.
> This was not tied to the 75s timeout (hard coded) because it is possible
> for a slave to successfully re-register with a master after 75s (e.g., both
> master and slave are down for 5 min).
>
> Also, a slave cannot connect to old executors with a new slave id.
>

Perfect, thanks for the quick response Vinod!
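
For my own notes, here is a rough sketch of how I now understand the executor side of this. It is only an illustration in Python, not Mesos code: the function name and the 600-second value are my own assumptions (the thread only says ">10 mins"), and the real check lives in the executor driver.

```python
# Illustrative model only; not taken from the Mesos source.
# ASSUMPTION: 600 s stands in for the configured recovery_timeout.
ILLUSTRATIVE_RECOVERY_TIMEOUT_S = 600.0


def executor_should_exit(slave_down_s: float, recovery_timeout_s: float) -> bool:
    """Return True if the executor should shut itself down.

    The decision depends only on how long the slave has been unreachable
    compared with recovery_timeout. The master's hard-coded 75 s timeout
    is deliberately not an input here, since the two are not tied together.
    """
    return slave_down_s > recovery_timeout_s


# Slave (and master) down for 5 minutes: longer than 75 s, but still
# within the recovery timeout, so the executor keeps running and the
# slave may still re-register with its existing slave ID.
keeps_running = not executor_should_exit(5 * 60, ILLUSTRATIVE_RECOVERY_TIMEOUT_S)

# Slave down for 15 minutes: past the recovery timeout, so the executor
# exits rather than risk running alongside a replacement task.
exits = executor_should_exit(15 * 60, ILLUSTRATIVE_RECOVERY_TIMEOUT_S)
```

The point of keeping the 75s value out of the function is exactly what Vinod describes: exceeding it does not by itself decide whether the executors survive.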

