Bernd Mathiske created MESOS-2072:
-------------------------------------
Summary: Fetcher cache eviction
Key: MESOS-2072
URL: https://issues.apache.org/jira/browse/MESOS-2072
Project: Mesos
Issue Type: Improvement
Components: fetcher, slave
Reporter: Bernd Mathiske
Assignee: Bernd Mathiske
Delete files from the fetcher cache so that a given cache size is never
exceeded. This must work while concurrent downloads are under way and
new requests keep arriving.
Idea: determine the size of each download before it begins and make enough
room in the cache before starting it. This means that only download mechanisms
that divulge the size before the main download will be supported. As far as we
know, those in use so far have this property.
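A minimal sketch of such a size probe, assuming only HTTP(S) URIs and plain
local paths (other mechanisms would need their own probe). The function name
probe_size is hypothetical and not part of Mesos:

```python
import os
import urllib.request


def probe_size(uri):
    """Return the size in bytes that a download of `uri` would occupy,
    or None if the source does not divulge it up front."""
    if "://" not in uri:
        # Plain local path: the file system already knows the size.
        return os.path.getsize(uri)
    # HEAD asks for the headers only, so no payload is transferred.
    request = urllib.request.Request(uri, method="HEAD")
    with urllib.request.urlopen(request) as response:
        length = response.headers.get("Content-Length")
    return int(length) if length is not None else None
```

A None result would mean the mechanism cannot be supported by this scheme,
since no room could be made in advance.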
The calculation of how much space to free needs to be under concurrency
control, accumulating all space needed for competing, incomplete download
requests. (The Python script that performs fetcher caching for Aurora does not
seem to implement this. See
https://gist.github.com/zmanji/f41df77510ef9d00265a and imagine several of
these programs running concurrently, each one's _cache_eviction() call
succeeding, each perceiving the SAME free space as available.)
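The accounting described above could be sketched as follows. This assumes
that all requests pass through a single process, so an in-process lock is
enough; separate programs would need a file lock or similar instead. The class
and method names are hypothetical:

```python
import threading


class CacheSpace:
    """Tracks cache capacity, including space promised to unfinished downloads."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0       # bytes of completed files in the cache
        self.reserved = 0   # bytes promised to downloads still in flight
        self._lock = threading.Lock()

    def reserve(self, size):
        """Atomically claim `size` bytes; return False if they do not fit."""
        with self._lock:
            if self.used + self.reserved + size > self.capacity:
                return False
            self.reserved += size
            return True

    def commit(self, size):
        """A download finished: its reservation becomes real usage."""
        with self._lock:
            self.reserved -= size
            self.used += size

    def cancel(self, size):
        """A download failed: give its reservation back."""
        with self._lock:
            self.reserved -= size
```

Because the check and the increment happen under the same lock, two competing
requests can never both claim the same free space.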
Ultimately, a conflict resolution strategy is needed for the case where the
downloads already under way exceed the cache capacity by themselves. In that
case, as a fallback, some tasks will download directly into the work
directory. How to pick which tasks use the fallback is TBD.
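One possible shape for this decision, as a sketch only. It assumes
oldest-first eviction and sends the request that does not fit to the work
directory; both choices are assumptions, since the selection policy is still
open. The function name make_room is hypothetical, and the sketch only plans
the eviction without deleting any files:

```python
from collections import OrderedDict


def make_room(entries, in_flight, capacity, size):
    """Decide where a download of `size` bytes should go.

    entries:   OrderedDict of name -> bytes for completed cache files,
               oldest first; evicted entries are removed from it
    in_flight: bytes already promised to unfinished downloads
    Returns ("cache", evicted_names) or ("direct", []).
    """
    # Eviction can only reclaim completed files. If the unfinished downloads
    # plus this one exceed the capacity, no amount of eviction helps.
    if in_flight + size > capacity:
        return "direct", []
    evicted = []
    used = sum(entries.values())
    while used + in_flight + size > capacity:
        name, freed = entries.popitem(last=False)  # oldest entry first
        used -= freed
        evicted.append(name)
    return "cache", evicted
```

For example, with two cached files of 40 bytes each, a capacity of 100 and a
new request of 50 bytes, only the older file is evicted.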
At first, only support copying of any downloaded files to the work directory
for task execution. This isolates the task life cycle after starting a task
from cache eviction considerations.
(Later, we can add symbolic links that avoid copying. But then eviction of
fetched files used by ongoing tasks must be blocked, which adds complexity.
Another future extension is MESOS-1667 "Extract from URI while downloading into
work dir".)
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)