Greg Mann created MESOS-9954:
--------------------------------

             Summary: Flapping tasks with large sandboxes can fill agent disk
                 Key: MESOS-9954
                 URL: https://issues.apache.org/jira/browse/MESOS-9954
             Project: Mesos
          Issue Type: Bug
            Reporter: Greg Mann


If a task on an agent pulls a large artifact into its sandbox and is repeatedly 
re-launched after failing, it can quickly fill the agent disk. This may happen 
on a time scale shorter than the disk watch interval, so the disk can fill up 
before the agent next checks disk usage.

We should evaluate solutions to this issue. A few options:
* Perhaps an aggressive (short) disk watch interval is sufficient? We should 
investigate the performance impact of this approach.
* If a shorter interval doesn't work, then maybe polling free disk space 
whenever a task is launched makes sense? (Rate-limiting this might be necessary.)
* Perhaps we can come up with some fundamentally different approach for 
detecting free disk space which would solve this issue?
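
To make the second option concrete, here is a minimal sketch of a rate-limited 
free-space check that runs on task launch. It is not Mesos code: the agent is 
written in C++, and the names RateLimitedDiskCheck, on_task_launch, 
min_interval and headroom are hypothetical, chosen only for illustration. The 
sketch assumes a launch hook exists and that a cached reading is acceptable 
between polls.

```python
import shutil
import time


class RateLimitedDiskCheck:
    """Polls free disk space on demand, at most once per `min_interval` seconds.

    Hypothetical helper for illustration; not part of Mesos.
    """

    def __init__(self, path, min_interval=5.0, clock=time.monotonic,
                 usage_fn=shutil.disk_usage):
        self.path = path
        self.min_interval = min_interval
        self._clock = clock          # injectable so the limiter can be tested
        self._usage_fn = usage_fn    # must return an object with .total and .free
        self._last_poll = None
        self._last_free_fraction = None
        self.polls = 0               # number of real polls performed

    def free_fraction(self):
        """Return the free fraction of the disk, re-polling only if the cache is stale."""
        now = self._clock()
        if self._last_poll is None or now - self._last_poll >= self.min_interval:
            usage = self._usage_fn(self.path)
            self._last_free_fraction = usage.free / usage.total
            self._last_poll = now
            self.polls += 1
        return self._last_free_fraction


def on_task_launch(check, headroom=0.1):
    """Return True if free space is below `headroom` and cleanup should run first."""
    return check.free_fraction() < headroom
```

With min_interval set well below the disk watch interval, a flapping task 
triggers at most one real poll per min_interval no matter how often it is 
re-launched, which bounds the cost of the extra checks. The trade-off is that 
a cached reading can be up to min_interval seconds old.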



--
This message was sent by Atlassian Jira
(v8.3.2#803003)