> *If* the 75 seconds is exceeded but we're within the recovery_timeout,
> the slave *should* register with a new slave ID. The slave daemon (with
> the new slave ID) reconnects to the old executors and updates them to use
> the new slave ID.
>
This is not true. 'recovery_timeout' was added to make sure that if a slave is down for a long time (>10 mins), its executors shut themselves down. It is better for the executor/task to die than to keep running, because the framework might already have launched another replica of that instance.

This was not tied to the (hard-coded) 75s timeout because a slave can still successfully re-register with a master after 75s (e.g., when both master and slave are down for 5 min). Also, a slave cannot connect to old executors with a new slave ID.

HTH,
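
To make the distinction concrete, here is a rough Python sketch of the two checks as I described them. It is a toy model, not Mesos code: the function names are made up, the 15 minute recovery_timeout value is only a placeholder for ">10 mins", and the assumption that the master can only time out a slave while the master itself is up is a simplification.

```python
# Toy model of the two independent timeouts discussed above.
# All names here are hypothetical; none of this is Mesos source code.

MASTER_SLAVE_TIMEOUT_SECS = 75       # hard-coded timeout mentioned in the thread
RECOVERY_TIMEOUT_SECS = 15 * 60      # placeholder value; the thread only says ">10 mins"


def executors_survive(slave_downtime_secs: int,
                      recovery_timeout_secs: int = RECOVERY_TIMEOUT_SECS) -> bool:
    """Executors keep running only if the slave comes back within recovery_timeout."""
    return slave_downtime_secs <= recovery_timeout_secs


def can_reregister(slave_downtime_secs: int, master_downtime_secs: int) -> bool:
    """Assumption: the master can only notice a missing slave while the master is up,
    so only the part of the slave's downtime that the master observed counts
    against the 75s timeout."""
    observed_secs = max(0, slave_downtime_secs - master_downtime_secs)
    return observed_secs <= MASTER_SLAVE_TIMEOUT_SECS


if __name__ == "__main__":
    # Both master and slave down for 5 min: well past 75s, yet re-registration
    # is still possible and the executors are still alive.
    print(can_reregister(300, 300), executors_survive(300))
    # Only the slave down for 5 min: the master has already timed it out,
    # but the executors have not yet hit recovery_timeout.
    print(can_reregister(300, 0), executors_survive(300))
    # Slave down for 20 min: the executors have shut themselves down.
    print(executors_survive(20 * 60))
```

The point of keeping the two functions separate is that neither result determines the other, which is why recovery_timeout was not tied to the 75s value.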

