So I have a bespoke framework that runs under Mesos 1.4.0 using the v1 HTTP API,
with a custom executor and checkpointing disabled.
When the framework is running happily and a new agent is added to the
cluster, all the existing executors are immediately terminated.
The scheduler is notified of the lost executors and tasks, then receives
offers from both the old and new agents and carries on normally.
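
For reference, the scheduler subscribes roughly like this (simplified
sketch of what my framework does; the framework name, user, and master
address are placeholders):

import json
import requests

MASTER = "http://127.0.0.1:5050"  # placeholder master address

# Minimal v1 scheduler SUBSCRIBE call with checkpointing disabled.
# Field names follow the v1 FrameworkInfo message.
subscribe = {
    "type": "SUBSCRIBE",
    "subscribe": {
        "framework_info": {
            "user": "dan",                 # placeholder
            "name": "bespoke-framework",   # placeholder
            "checkpoint": False,           # checkpointing disabled
        }
    },
}

# The response is a chunked RecordIO stream of events
# (SUBSCRIBED, OFFERS, UPDATE, FAILURE, ...), so keep it open.
resp = requests.post(
    f"{MASTER}/api/v1/scheduler",
    data=json.dumps(subscribe),
    headers={"Content-Type": "application/json",
             "Accept": "application/json"},
    stream=True,
)
print(resp.status_code, resp.headers.get("Mesos-Stream-Id"))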

I would expect, however, that the existing executors would keep running and
that the scheduler would simply receive offers from the new agent.
It's as if agent recovery is being performed when the new agent is launched,
even though no old agent has exited.
Experiments so far have been with the whole cluster on a single host: the
master on 127.0.0.1, and each agent with its own IP, hostname, and port.
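
The single-host layout is launched roughly like this (simplified sketch;
the IPs, hostnames, ports, and paths below are illustrative, and each
agent gets its own work_dir):

import subprocess

# One master on 127.0.0.1 plus two agents on the same host.
subprocess.Popen([
    "mesos-master",
    "--ip=127.0.0.1",
    "--port=5050",
    "--work_dir=/tmp/mesos/master",
])

# Each agent has its own IP, hostname, port, and work_dir
# (agents on the same host must not share a work_dir).
agents = [
    ("127.0.0.2", "agent1", "5051", "/tmp/mesos/agent1"),
    ("127.0.0.3", "agent2", "5052", "/tmp/mesos/agent2"),
]
for ip, hostname, port, work_dir in agents:
    subprocess.Popen([
        "mesos-agent",
        "--master=127.0.0.1:5050",
        f"--ip={ip}",
        f"--hostname={hostname}",
        f"--port={port}",
        f"--work_dir={work_dir}",
    ])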

Am I missing a configuration parameter? Or is this correct behavior?

-Dan
