Greg Mann created MESOS-9954:

             Summary: Flapping tasks with large sandboxes can fill agent disk
                 Key: MESOS-9954
             Project: Mesos
          Issue Type: Bug
            Reporter: Greg Mann

If a task on an agent fails and is relaunched repeatedly, and each launch pulls
a large artifact into its sandbox, the accumulated sandboxes can quickly fill
the agent disk. This may happen on a time scale shorter than the disk watch
interval, so the disk can fill up before the agent next checks disk usage.

We should evaluate solutions to this issue. A few options:
* Perhaps an aggressive (short) disk watch interval is sufficient? We should 
investigate the performance impact of this approach.
* If a shorter interval doesn't work, then maybe polling free disk space
whenever a task is launched makes sense? (Rate-limiting this might be
necessary.)
* Perhaps we can come up with some fundamentally different approach for 
detecting free disk space which would solve this issue?
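
As a rough illustration of the second option, here is a minimal sketch of a
rate-limited disk check that could run on each task launch. It is written in
Python for brevity and is not Mesos code: the class name
RateLimitedDiskCheck, its parameters, and the 0.9 threshold are all
hypothetical. The only real API it relies on is shutil.disk_usage from the
Python standard library.

```python
import shutil
import time


class RateLimitedDiskCheck:
    """Hypothetical helper: polls disk usage at most once per interval."""

    def __init__(self, path, min_interval_secs=5.0, threshold=0.9,
                 usage_fn=shutil.disk_usage, clock=time.monotonic):
        self.path = path
        self.min_interval_secs = min_interval_secs
        self.threshold = threshold      # assumed "disk is too full" fraction
        self._usage_fn = usage_fn       # injectable so it can be faked in tests
        self._clock = clock
        self._last_poll = None
        self._last_fraction = None

    def used_fraction(self):
        """Return used/total, re-polling only if the interval has elapsed."""
        now = self._clock()
        stale = (self._last_poll is None
                 or now - self._last_poll >= self.min_interval_secs)
        if stale:
            usage = self._usage_fn(self.path)
            self._last_fraction = usage.used / usage.total
            self._last_poll = now
        return self._last_fraction

    def should_gc_before_launch(self):
        """True if usage is at or above the threshold (possibly cached)."""
        return self.used_fraction() >= self.threshold
```

A launch path could call should_gc_before_launch() before fetching
artifacts. A burst of relaunches inside one interval would reuse the cached
reading, so the cost of polling stays bounded, at the price of a reading that
can be up to min_interval_secs old.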

This message was sent by Atlassian Jira