-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30626/
-----------------------------------------------------------
Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and Timothy
Chen.
Bugs: MESOS-2072
https://issues.apache.org/jira/browse/MESOS-2072
Repository: mesos
Description
-------
Fetcher cache eviction happens when the cache does not have enough space to
accommodate upcoming downloads to the cache. Necessary provisions included here:
- mesos-fetcher does not run until eviction has succeeded.
- Cache space is reserved while (async) waiting for eviction to succeed. If
eviction fails, the reservation gets undone.
- Reservations can be partly from available space, partly from evictions. All
math included :-)
- To find out how much space is needed, downloading has a prelude in which we
query the download size from the URI. This works for all URI types that
mesos-fetcher currently supports, including http and hdfs.
- Size-determination requests are not synchronized and can be repeated. That is
deemed OK, since they are small. Downloading, however, remains synchronized (by
the fetcher actor), as it has been since MESOS-2057. This avoids repeated
downloads and potential bandwidth choking.
- There is cleanup code for all kinds of error situations. Lists of URIs or
cache files are passed down as shared pointers to continuations, which can add
to these lists. At the very end of the fetch attempt, each list is processed
to undo things like space reservations and eviction disabling.
- Eviction gets disabled for cache files whose URIs are currently in use. We
use reference counting for this, since there may be concurrent fetch attempts
using the same cache files.
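
To illustrate the reservation math and the reference counting described above,
here is a minimal, synchronous sketch. It is not the code in this patch: the
names CacheSketch, Entry, reserve(), release() and undo() are hypothetical, the
oldest-first eviction order is an assumption, and the real implementation waits
for eviction asynchronously instead of evicting inline.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <list>
#include <string>

// Hypothetical cache entry, for illustration only (not a Mesos type).
struct Entry {
  std::string key;
  std::uint64_t size;
  int refCount;  // > 0: in use by a fetch attempt, so eviction is disabled.
};

class CacheSketch {
public:
  explicit CacheSketch(std::uint64_t capacity) : capacity_(capacity), used_(0) {}

  std::uint64_t available() const { return capacity_ - used_; }

  bool contains(const std::string& key) const {
    for (const Entry& e : entries_) {
      if (e.key == key) return true;
    }
    return false;
  }

  // Reserves `needed` bytes, partly from free space and partly by evicting
  // unreferenced entries. Returns false and changes nothing if the
  // shortfall cannot be covered.
  bool reserve(const std::string& key, std::uint64_t needed) {
    if (needed > capacity_) return false;
    const std::uint64_t fromFree = std::min(needed, available());
    const std::uint64_t shortfall = needed - fromFree;

    // Dry run: can unreferenced entries cover the shortfall?
    std::uint64_t evictable = 0;
    for (const Entry& e : entries_) {
      if (evictable >= shortfall) break;
      if (e.refCount == 0) evictable += e.size;
    }
    if (evictable < shortfall) return false;

    // Evict unreferenced entries, oldest first, until enough space is free.
    for (auto it = entries_.begin();
         it != entries_.end() && available() < needed;) {
      if (it->refCount == 0) {
        used_ -= it->size;
        it = entries_.erase(it);
      } else {
        ++it;
      }
    }

    used_ += needed;
    entries_.push_back(Entry{key, needed, 1});  // Referenced by the reserver.
    return true;
  }

  // Drops one reference; at zero the entry becomes evictable again.
  void release(const std::string& key) {
    auto it = find(key);
    assert(it != entries_.end() && it->refCount > 0);
    --it->refCount;
  }

  // Undoes a reservation, e.g. after a failed download.
  void undo(const std::string& key) {
    auto it = find(key);
    if (it == entries_.end()) return;
    used_ -= it->size;
    entries_.erase(it);
  }

private:
  std::list<Entry>::iterator find(const std::string& key) {
    return std::find_if(entries_.begin(), entries_.end(),
                        [&key](const Entry& e) { return e.key == key; });
  }

  std::uint64_t capacity_;
  std::uint64_t used_;
  std::list<Entry> entries_;  // Oldest entry first.
};
```

With a capacity of 100, reserving 60 for "a" and 30 for "b", releasing "a", and
then reserving 50 for "c" takes 10 bytes from free space and covers the
remaining 40 by evicting "a". A later request that would require evicting "b"
or "c" fails while they are still referenced.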
This patch depends on a series of smaller preparatory ones.
Diffs
-----
src/slave/containerizer/fetcher.hpp 1db0eaf002c8d0eaf4e0391858e61e0912b35829
src/slave/containerizer/fetcher.cpp d290f95251def3952c5ee34f600e1d71467f6293
Diff: https://reviews.apache.org/r/30626/diff/
Testing
-------
This still has a couple of minor bugs in the fetcher cache tests (not
necessarily in the code the tests run, just in the tests themselves). I posted
it already because the necessary changes should be minimal and reviews on this
code can start earlier this way. I will update and test shortly.
Thanks,
Bernd Mathiske