Hi, I've been making some modifications to the Hadoop framework recently and have come up against a brick wall. I'm wondering if the concept of killing an executor from a framework has been discussed before?
Currently we launch two tasks for each Hadoop TaskTracker: a "control" task that gets a small slice of CPU plus all of the memory, and a second task that gets the rest of the CPU. Together they add up to the total resources we want to give each TaskTracker (similar in spirit to how Spark does it). The reason for the split is so we can free up CPU and remove slots from a TaskTracker, leaving it half dead but keeping the executor alive, by killing the second task. At some undefined point in the future we then want to kill the executor itself, which we do by killing the "control" task.

This approach doesn't work very well in practice because of https://issues.apache.org/jira/browse/MESOS-1812, which means tasks are not launched in order on the slave. There is no way to guarantee the control task comes up first, and that leads to all sorts of interesting races.

Is this a bad road to go down? I can't use framework messages, as I don't believe they are a reliable way of sending signals, so I'm not sure where else to turn.
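To make the setup concrete, here is a rough sketch of what our scheduler does when it launches a TaskTracker. It's heavily simplified; the task names, the executor command, and the 0.1-CPU control share are illustrative rather than lifted from our real code:

import java.util.Arrays;
import java.util.List;

import org.apache.mesos.Protos.CommandInfo;
import org.apache.mesos.Protos.ExecutorID;
import org.apache.mesos.Protos.ExecutorInfo;
import org.apache.mesos.Protos.Offer;
import org.apache.mesos.Protos.Resource;
import org.apache.mesos.Protos.TaskID;
import org.apache.mesos.Protos.TaskInfo;
import org.apache.mesos.Protos.Value;
import org.apache.mesos.SchedulerDriver;

public class TwoTaskLaunchSketch {

  // Convenience builder for a scalar resource (cpus/mem).
  private static Resource scalar(String name, double value) {
    return Resource.newBuilder()
        .setName(name)
        .setType(Value.Type.SCALAR)
        .setScalar(Value.Scalar.newBuilder().setValue(value))
        .build();
  }

  // Launch one TaskTracker as two tasks sharing a single executor:
  //  - a "control" task holding a sliver of CPU plus all of the memory,
  //  - a "slots" task holding the remaining CPU.
  static void launchTaskTracker(SchedulerDriver driver, Offer offer,
                                double totalCpus, double totalMem) {
    // Placeholder executor; the real one bootstraps the TaskTracker JVM.
    ExecutorInfo executor = ExecutorInfo.newBuilder()
        .setExecutorId(ExecutorID.newBuilder().setValue("tt-" + offer.getHostname()))
        .setCommand(CommandInfo.newBuilder().setValue("./tasktracker-executor.sh"))
        .build();

    double controlCpus = 0.1;  // illustrative: just enough CPU to keep the task alive

    TaskInfo control = TaskInfo.newBuilder()
        .setName("control")
        .setTaskId(TaskID.newBuilder().setValue("control-" + offer.getHostname()))
        .setSlaveId(offer.getSlaveId())
        .setExecutor(executor)
        .addResources(scalar("cpus", controlCpus))
        .addResources(scalar("mem", totalMem))   // all of the memory lives on this task
        .build();

    TaskInfo slots = TaskInfo.newBuilder()
        .setName("slots")
        .setTaskId(TaskID.newBuilder().setValue("slots-" + offer.getHostname()))
        .setSlaveId(offer.getSlaveId())
        .setExecutor(executor)                   // same executor as the control task
        .addResources(scalar("cpus", totalCpus - controlCpus))
        .build();

    // Both tasks go out against the same offer; per MESOS-1812 the slave may
    // start them in either order, so "control comes up first" isn't guaranteed.
    List<TaskInfo> tasks = Arrays.asList(control, slots);
    driver.launchTasks(offer.getId(), tasks);
  }
}

Both TaskInfos point at the same ExecutorInfo, so killing the "slots" task hands its CPU back to the cluster while the executor (and the TaskTracker) keeps running; killing the "control" task is what finally brings the executor down.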
Cheers, Tom.