[ https://issues.apache.org/jira/browse/MESOS-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ian Downes updated MESOS-1667:
------------------------------
    Target Version/s: 0.22.0  (was: 0.21.0)

> Extract from URI while downloading into work dir
> ------------------------------------------------
>
>                 Key: MESOS-1667
>                 URL: https://issues.apache.org/jira/browse/MESOS-1667
>             Project: Mesos
>          Issue Type: Improvement
>          Components: slave
>    Affects Versions: 0.20.0
>         Environment: Every
>            Reporter: Bernd Mathiske
>              Labels: features, performance
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> When the fetcher downloads an extractable archive, e.g. a tar file, it
> currently downloads it completely and only then starts extracting from it.
> But only the end result is needed for execution. Thus the space used for the
> downloaded copy of the archive is wasted. This can become critical in the
> case of large archives.
> The general idea for solving this issue is to perform the extraction while
> downloading, without storing intermediate results on disk. Possibly, this
> can be achieved by arranging process pipes or by using some extraction
> library code to stream the data through.
> However, as a result of this, a repeated download may be required every
> time, whereas given an existing (https://reviews.apache.org/r/21316/) but
> not yet committed patch for MESOS-336, the fetcher cache could just repeat
> the extraction, without downloading more than once. Thus choosing in-stream
> extraction might result in an overall performance loss. We should therefore
> give users extra options in CommandInfo.URI to choose how to handle this.
> In some cases, it could be possible to reuse the extracted assets directly,
> also forgoing the repeat extraction. This could be handled with symlinks.
> Then extraction can happen during downloading, and neither repeat
> downloading nor repeat extraction occurs. The user has to be conscious of
> the safety issue, though, that any post-extraction modifications to the
> downloaded assets are visible to subsequent tasks.
> So, an explicit flag in CommandInfo.URI is called for here, as well.
> Ideally, this issue would be solved as a follow-up to MESOS-336, because
> some of the described benefits depend on it.
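
To make the "extract while downloading" idea in the description concrete, here is a minimal sketch in Python (not the Mesos fetcher, which is C++). It uses the standard library's tarfile stream mode ("r|*"), which reads an archive strictly front to back and therefore never needs the complete archive on disk. The names extract_stream, ForwardOnly and make_demo_archive are hypothetical helpers invented for this illustration; the in-memory archive merely stands in for a download.

```python
import io
import os
import tarfile
import tempfile


class ForwardOnly:
    """Hypothetical stand-in for a network response: read() only, no seek()."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, n=-1):
        return self._buf.read(n)


def extract_stream(fileobj, dest_dir):
    """Extract a (possibly compressed) tar archive from a forward-only stream.

    Members are written into dest_dir as they arrive; the archive itself is
    never stored. Returns the names of the extracted members.
    """
    names = []
    # "r|*" selects tarfile's stream mode with transparent compression detection.
    with tarfile.open(fileobj=fileobj, mode="r|*") as tar:
        for member in tar:
            if hasattr(tarfile, "data_filter"):
                # Newer Python versions: refuse unsafe members (e.g. "../" paths).
                tar.extract(member, dest_dir, filter="data")
            else:
                tar.extract(member, dest_dir)
            names.append(member.name)
    return names


def make_demo_archive():
    """Build a small gzip-compressed tar archive in memory for the demo."""
    payload = b"hello from the executor\n"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        info = tarfile.TarInfo(name="bin/run.sh")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work_dir:
        extracted = extract_stream(ForwardOnly(make_demo_archive()), work_dir)
        print(extracted, os.listdir(work_dir))
```

For a real download, the object returned by urllib.request.urlopen(url) could presumably be passed as fileobj in the same way, since stream mode only calls read(). As the description notes, this saves the disk space of the archive copy but gives up the option of re-extracting from a cached download.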