[ 
https://issues.apache.org/jira/browse/MESOS-336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13972220#comment-13972220
 ] 

Bernd Mathiske commented on MESOS-336:
--------------------------------------

@Vinod: In general, we cannot encode a URI in a filename, because filenames 
have limited length (e.g. 255 chars) and URIs can be longer than that. Any 
scheme that shortens a URI to fit would be lossy, which could in principle 
lead to collisions. 

@Ben: OK, so you would want to trust the probability of collision to be small 
enough. Fair enough. Maybe use SHA-256 then. But how do you know what the 
checksum of the second URL in your example actually is without downloading it 
first? Also, I don't think that loading the same resource from multiple 
different URIs is an important use case.
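
To illustrate the distinction, here is a minimal Python sketch (not Mesos code; the function names and the example URI are hypothetical). A digest of the URI string has a fixed length and can be computed before any download, whereas a digest of the contents needs the bytes in hand first.

```python
import hashlib


def uri_digest(uri: str) -> str:
    """SHA-256 of the URI string itself; always 64 hex characters."""
    return hashlib.sha256(uri.encode("utf-8")).hexdigest()


def content_digest(data: bytes) -> str:
    """SHA-256 of the resource's bytes; only possible after downloading."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical URI that is longer than a typical 255-char filename limit.
long_uri = "hdfs://namenode.example/executors/" + "a" * 300 + ".tar.gz"

name = uri_digest(long_uri)
print(len(long_uri) > 255, len(name))  # → True 64
```

The URI digest fits in a filename, but it identifies the URI, not the contents, so two different URIs for the same resource would still get two cache entries.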

Back to your previous comment above where you wrote "If the requested file 
exists". What identifies "the requested file"? Primarily, it's the URI, not its 
contents. Using a checksum flips this: the contents become the identity. So if 
the framework presents the checksum in CommandInfo as described in MESOS-700 
instead of a URI, where does that checksum come from? The framework user would 
have to put it there. Or there would have to be a URI->checksum mapping that is 
derived from the first download. Here you have two choices. You can persist 
that mapping and have the fetcher write and read it. Or you can keep it in the 
slave. I am opting for the latter, because it does not require me to write all 
that I/O code for the mapping. But then it turns out that once you have any 
mapping, you might as well map URIs to cache files and the checksum becomes 
irrelevant...
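
A rough sketch of the in-memory variant, in Python rather than the slave's own language (the class `UriCache`, the `download` callable and the `entry-N` file naming are all hypothetical, not anything in Mesos): the map goes straight from URI to cache file, so no checksum is involved and nothing is persisted.

```python
import itertools
import os


class UriCache:
    """In-memory map from URI to cache file path (hypothetical sketch)."""

    def __init__(self, cache_dir, download):
        self.cache_dir = cache_dir
        self.download = download         # callable(uri, path): fetches uri into path
        self.entries = {}                # uri -> path; lost when the process exits
        self.counter = itertools.count()

    def fetch(self, uri):
        """Return the cache file for uri, downloading only on first request."""
        path = self.entries.get(uri)
        if path is None:
            # File names are just sequence numbers, so URI length is irrelevant.
            path = os.path.join(self.cache_dir, f"entry-{next(self.counter)}")
            self.download(uri, path)
            self.entries[uri] = path
        return path
```

Because the mapping lives only in memory, it is gone after a restart; that is the price of not writing the I/O code for a persisted mapping.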

This would all be easier if the URIs came with checksums pre-attached. Do they?




> Mesos slave should cache executors
> ----------------------------------
>
>                 Key: MESOS-336
>                 URL: https://issues.apache.org/jira/browse/MESOS-336
>             Project: Mesos
>          Issue Type: Improvement
>          Components: slave
>            Reporter: brian wickman
>            Assignee: Bernd Mathiske
>              Labels: newbie
>
> The slave should be smarter about how it handles pulling down executors.  In 
> our environment, executors rarely change but the slave will always pull them 
> down from HDFS regardless.  This puts undue stress on our HDFS clusters, and 
> is not resilient to reduced HDFS availability.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
