[
https://issues.apache.org/jira/browse/MESOS-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Bernd Mathiske updated MESOS-1667:
----------------------------------
Assignee: (was: Bernd Mathiske)
> Extract from URI while downloading into work dir
> ------------------------------------------------
>
> Key: MESOS-1667
> URL: https://issues.apache.org/jira/browse/MESOS-1667
> Project: Mesos
> Issue Type: Improvement
> Components: fetcher, slave
> Affects Versions: 0.20.0
> Environment: Every
> Reporter: Bernd Mathiske
> Labels: features, mesosphere, performance
> Original Estimate: 96h
> Remaining Estimate: 96h
>
> When the fetcher downloads an extractable archive, e.g. a tar file, it
> currently downloads it completely and only then starts extracting from it.
> But only the end result is needed for execution. Thus the space used for the
> downloaded copy of the archive is wasted. This can become critical in case of
> large archives.
> The general idea to solve this issue is to perform the extraction while
> downloading, without storing intermediate results on disk. Possibly, this
> can be achieved by arranging process pipes or by streaming the data through
> some extraction library code.
> However, as a result of this, a fresh download may be required every time
> the archive is needed, whereas given an existing but not yet committed patch
> for MESOS-336 (https://reviews.apache.org/r/21316/), the fetcher cache could
> just repeat the extraction, without downloading more than once. Thus
> choosing in-stream extraction might result in an overall performance loss.
> We should therefore give users extra options in CommandInfo.URI to choose
> how to handle this.
> In some cases, it could be possible to reuse the extracted assets directly,
> also forgoing the repeat extraction. This could be handled with symlinks.
> Then extraction can happen during downloading, and neither repeat
> downloading nor repeat extraction occurs. The user has to be conscious of
> the safety issue, though, that any post-extraction modifications to the
> downloaded assets are visible to subsequent tasks. So, an explicit flag in
> CommandInfo.URI is called for here, as well.
> Ideally, this issue would be solved as a follow-up to MESOS-336, because some
> of the described benefits depend on it.
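
For illustration only, the "extract while downloading" idea from the
description could be sketched as follows. This is not the Mesos fetcher
(which is C++) and not a proposed patch; it is a minimal Python sketch that
assumes a tar archive reachable over a URL. The function names
extract_stream and fetch_and_extract are hypothetical. It relies on the
standard tarfile stream mode ("r|*"), which reads the archive sequentially
and never seeks, so no copy of the archive is written to disk.

```python
import os
import tarfile
import urllib.request


def extract_stream(stream, work_dir):
    """Unpack a tar archive from a sequential byte stream into work_dir.

    Hypothetical helper: mode "r|*" reads the stream front to back with
    transparent compression detection, so the archive itself is never
    stored on disk. Returns the member names in archive order.
    """
    root = os.path.realpath(work_dir)
    names = []
    with tarfile.open(fileobj=stream, mode="r|*") as archive:
        for member in archive:
            # Refuse members that would land outside the work directory.
            target = os.path.realpath(os.path.join(root, member.name))
            if os.path.commonpath([root, target]) != root:
                raise ValueError("unsafe path in archive: " + member.name)
            # In stream mode each member must be extracted as it is read.
            archive.extract(member, path=root)
            names.append(member.name)
    return names


def fetch_and_extract(uri, work_dir):
    """Download uri and extract it on the fly (hypothetical helper)."""
    with urllib.request.urlopen(uri) as response:
        return extract_stream(response, work_dir)
```

As the description notes, the trade-off is that nothing is left behind to
re-extract from, so a later task needing the same archive would have to
download it again unless the extracted result itself is cached.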