Huadong Liu created MESOS-6221:

             Summary: Ability to post maintenance/schedule with better 
                 Key: MESOS-6221
             Project: Mesos
          Issue Type: Improvement
          Components: HTTP API
            Reporter: Huadong Liu

Currently the maintenance schedule update is at cluster granularity: "To update 
the maintenance schedule, the operator should first read the current schedule, 
make any necessary changes, and then post the modified schedule."

In contrast, the machine/down and up endpoints operate at host granularity. One 
or a set of hosts can be moved to DOWN mode or UP mode once the schedule exists.

Requiring a GET of the current schedule before POSTing an updated schedule may 
create races if the machine/up call and the maintenance/schedule update happen 
on different hosts/processes. For example:

1. The Mesos master has host A in maintenance DOWN mode.
2. Process p1 tries to UP host A.
3. Process p2 tries to get the current schedule and then append host B to the 
schedule.
4. The Mesos master may end up having A and B in maintenance DRAIN mode although 
the desired result is to have B in DRAIN mode only.
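
The interleaving above can be sketched with a toy in-memory model. This is only 
an illustration, not the real master: ToyMaster and its methods are hypothetical 
names, and the model assumes that machine/up removes a host from the schedule 
and that a posted schedule replaces the stored one wholesale.

```python
class ToyMaster:
    """Hypothetical stand-in for the master's maintenance state."""

    def __init__(self, down_hosts):
        # hostname -> maintenance mode ("DOWN" or "DRAIN")
        self.modes = {host: "DOWN" for host in down_hosts}

    def get_schedule(self):
        # Returns a snapshot (copy) of the scheduled hostnames.
        return sorted(self.modes)

    def post_schedule(self, hosts):
        # Whole-schedule replace: hosts not already known enter DRAIN,
        # known hosts keep their mode, missing hosts are dropped.
        self.modes = {h: self.modes.get(h, "DRAIN") for h in hosts}

    def machine_up(self, host):
        # Assumed behaviour: UP removes the host from the schedule.
        self.modes.pop(host, None)


def run_race():
    master = ToyMaster(["A"])            # step 1: A is DOWN
    stale = master.get_schedule()        # p2 reads the schedule, sees A
    master.machine_up("A")               # step 2: p1 brings A UP
    master.post_schedule(stale + ["B"])  # step 3: p2 posts stale copy plus B
    return master.modes                  # step 4: A is back, in DRAIN


if __name__ == "__main__":
    print(run_race())
```

In this model the stale read in p2 silently undoes p1's UP, which is the lost 
update described in step 4.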

I cannot find a document that explains why the maintenance schedule has to be 
updated at cluster granularity. Although the problem can be resolved by 
external synchronization, having the ability to update the maintenance schedule 
at host granularity seems a better choice.
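
As a rough sketch of what a host-granularity update could look like, the toy 
model below adds a per-host operation that the master applies atomically. The 
names ToyMaster and add_to_schedule are hypothetical and do not correspond to 
an existing Mesos endpoint. The model assumes machine/up removes a host from 
the schedule.

```python
class ToyMaster:
    """Hypothetical stand-in for the master's maintenance state."""

    def __init__(self, down_hosts):
        # hostname -> maintenance mode ("DOWN" or "DRAIN")
        self.modes = {host: "DOWN" for host in down_hosts}

    def machine_up(self, host):
        # Assumed behaviour: UP removes the host from the schedule.
        self.modes.pop(host, None)

    def add_to_schedule(self, hosts):
        # Hypothetical host-granularity update: only the named hosts are
        # touched, so the caller never needs a copy of the full schedule.
        for host in hosts:
            self.modes.setdefault(host, "DRAIN")


def final_modes(order):
    """Run p1 (UP host A) and p2 (schedule host B) in the given order."""
    master = ToyMaster(["A"])
    steps = {
        "p1": lambda: master.machine_up("A"),
        "p2": lambda: master.add_to_schedule(["B"]),
    }
    for name in order:
        steps[name]()
    return master.modes


if __name__ == "__main__":
    print(final_modes(["p1", "p2"]))
    print(final_modes(["p2", "p1"]))
```

Because neither process carries a stale copy of the whole schedule, both 
orderings leave only B in DRAIN mode in this model.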
