> When the leader receives a request to drain a worker, it must first mark
> the worker as in the process to be drained i.e. blacklist the worker so
> that no new assignments can be assigned to it. We can perhaps just save the
> blacklist in memory. The worker should then create a new scheduling in
> which the assignments of the worker to be drained are moved to other
> workers perhaps in a round robin distribution. Afterwards, the leader
> should mark the drain of the worker to be complete.
>
> There are some caveats to this approach. If the leader fails before
> completing the drain request. The drain request will not be fulfilled.
> However, if the client frequently checks the status of the drain, it should
> notice that the drain is not running and can re-submit a request.
A couple of questions/comments.

When a leader fails, does the new leader automatically create a new assignment, or does it continue with the assignment from the previous leader?

Is the drain request a new concept in the model? I would suggest that the drain command instead mark the worker as unschedulable, and that this mark be persisted. The check for whether draining is complete then becomes whether the worker is still doing any work (i.e. whether it has seen and processed the schedule that it is no longer a part of). This way there is no "drain" request to track as such. There is only marking the worker as unschedulable, which is idempotent.

The leader should work in a declarative rather than imperative fashion. That is, it should generate the desired schedule, and the workers should work to match this schedule. This should avoid the leader failure issue, since a new leader can regenerate the desired schedule from the persisted state rather than resume a half-finished request.

-Ivan
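
P.S. To make the declarative idea concrete, here is a rough sketch in Python. Every name in it (`ClusterState`, `mark_unschedulable`, `desired_schedule`, `is_drained`) is hypothetical and made up for illustration, not an existing API, and the in-memory dataclass only stands in for whatever persistent store would actually hold the unschedulable set. It assumes that workers report what they are currently running and that displaced assignments are spread round robin, as in the original proposal.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ClusterState:
    """Hypothetical persisted state; a real system would keep this in a metadata store."""
    workers: set[str]
    unschedulable: set[str] = field(default_factory=set)
    # assignment name -> worker it is currently scheduled on
    schedule: dict[str, str] = field(default_factory=dict)


def mark_unschedulable(state: ClusterState, worker: str) -> None:
    """Idempotent: repeating the call leaves the state unchanged."""
    state.unschedulable.add(worker)


def desired_schedule(state: ClusterState, tasks: list[str]) -> dict[str, str]:
    """Derive the schedule purely from persisted state.

    Because nothing here depends on leader-local memory, a newly elected
    leader computes the same result as the one that failed.
    """
    eligible = sorted(state.workers - state.unschedulable)
    if not eligible:
        raise RuntimeError("no schedulable workers")
    new_schedule: dict[str, str] = {}
    moved = 0
    for task in sorted(tasks):
        current = state.schedule.get(task)
        if current in eligible:
            # Leave assignments on healthy workers where they are.
            new_schedule[task] = current
        else:
            # Spread displaced assignments round robin over eligible workers.
            new_schedule[task] = eligible[moved % len(eligible)]
            moved += 1
    return new_schedule


def is_drained(worker: str, running: dict[str, set[str]]) -> bool:
    """A worker is drained once it reports that it is running nothing."""
    return not running.get(worker)
```

With this shape, "drain w1" is just `mark_unschedulable(state, "w1")` followed by the leader publishing `desired_schedule(...)`. The client polls `is_drained` against what the workers report, so there is no in-flight drain request that can be lost with the leader.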