Hi, I'm developing my own framework that distributes more than 100 independent tasks across the cluster and runs them in arbitrary order. My problem is that each task's execution environment is a fairly large tarball (2 to 6 GB, mostly application jar files), and a task itself finishes within 1 to 200 seconds, while extracting the tarball takes tens of seconds every time. Extracting the same tarball again and again for every task is wasteful overhead that cannot be ignored.
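
To make the request concrete, here is a minimal sketch of the "extract once, reuse for later tasks" behaviour, assuming a long-lived custom executor that can keep a directory on local disk. It uses only the Python standard library and is not a Mesos API; `cache_key`, `ensure_extracted`, and the cache directory layout are hypothetical names chosen for illustration.

```python
import hashlib
import os
import shutil
import tarfile
import tempfile


def cache_key(tarball_path):
    """Cheap identity for a tarball: absolute path, size and mtime (no content hashing)."""
    st = os.stat(tarball_path)
    raw = f"{os.path.abspath(tarball_path)}:{st.st_size}:{st.st_mtime_ns}"
    return hashlib.sha256(raw.encode()).hexdigest()[:16]


def ensure_extracted(tarball_path, cache_root):
    """Return a directory holding the extracted tarball, extracting at most once."""
    target = os.path.join(cache_root, cache_key(tarball_path))
    if os.path.isdir(target):
        return target  # an earlier task already paid the extraction cost
    os.makedirs(cache_root, exist_ok=True)
    # Extract into a staging directory first so a half-extracted tree is never reused.
    staging = tempfile.mkdtemp(dir=cache_root, prefix=".staging-")
    try:
        with tarfile.open(tarball_path) as tar:
            tar.extractall(staging)
        os.rename(staging, target)  # atomic when both paths are on the same filesystem
    except Exception:
        shutil.rmtree(staging, ignore_errors=True)
        if not os.path.isdir(target):
            raise  # real failure; otherwise another task won the race and target is usable
    return target
```

Each task would call `ensure_extracted(path_to_tarball, some_cache_dir)` before starting and run against the returned directory. The key is derived from path, size, and mtime rather than from the file contents, because hashing several gigabytes per task would reintroduce the overhead this is meant to remove.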
The fetcher cache is great, but in my case it isn't enough: I want to preserve all files extracted from the tarball for as long as my executor is alive. If Mesos could cache the extracted files as well, skipping not only the download but also the extraction, I could save even more time.

Neither the "Fetcher Cache Internals" [1] nor the "Fetcher Cache" [2] section of the official documentation mentions this issue or any related future work. How do you solve this kind of extraction overhead when you have rather large resources?

One option would be to set up an internal Docker registry and let the slaves cache a Docker image that includes our jar files, which would avoid the tarball extraction. However, I want to keep additional moving parts out of our system as much as I can.

Another option might be to let the fetcher fetch all jar files individually on the slaves. I think that is feasible, but I don't think it would be easy to manage in production.

PS: Mesos is great; it is helping us a lot. I appreciate all the efforts of the community. Thank you so much!

[1] http://mesos.apache.org/documentation/latest/fetcher-cache-internals/
[2] http://mesos.apache.org/documentation/latest/fetcher/

Kota UENISHI

