----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30037/#review69973 -----------------------------------------------------------
Lots of feedback. Thanks for your patience. include/mesos/fetcher/fetcher.proto <https://reviews.apache.org/r/30037/#comment114834> "directly" is a little too vague for me. DOWNLOAD_INTO_WORK_DIR? BYPASS_CACHE? include/mesos/fetcher/fetcher.proto <https://reviews.apache.org/r/30037/#comment114835> CacheAction? include/mesos/fetcher/fetcher.proto <https://reviews.apache.org/r/30037/#comment114836> Nit: Don't abbreviate 'dir' when you have plenty of room and the variable itself doesn't abbreviate. (For all I know, here 'dir' could be short for 'directly') include/mesos/fetcher/fetcher.proto <https://reviews.apache.org/r/30037/#comment114837> Is 'Item' too vague? I can't think of another term that isn't already overloaded (resource, file, uri, binary, object), so perhaps 'Item' is the clearest after all. src/slave/containerizer/fetcher.hpp <https://reviews.apache.org/r/30037/#comment114838> This generates a unique pid, in case of multiple simultaneous fetchers? src/slave/containerizer/fetcher.hpp <https://reviews.apache.org/r/30037/#comment114839> Is the custom output redirection still supported, or was it never used? Can we be sure that Twitter or another user isn't using the redirection for their own custom fetcher? src/slave/containerizer/fetcher.hpp <https://reviews.apache.org/r/30037/#comment114900> Why does FetcherProcess need to be a friend class? Seems like it shouldn't be necessary. src/slave/containerizer/fetcher.hpp <https://reviews.apache.org/r/30037/#comment114840> Option vs. Try? src/slave/containerizer/fetcher.cpp <https://reviews.apache.org/r/30037/#comment114844> I'm assuming this gets used for something other than extension() in a future patch, right? src/slave/containerizer/fetcher.cpp <https://reviews.apache.org/r/30037/#comment114843> What about fetching from ftp://, ftps, hdfs, hftp, s3, s3n, etc? This "Malformed URI" logic used to apply to ftp/ftps as well, but I'm guessing s3 and hdfs also wouldn't like fetching from a URI that doesn't have a '/', or ends in '/'. Not sure why the protocol string really matters when determining a basename for the cache file. src/slave/containerizer/fetcher.cpp <https://reviews.apache.org/r/30037/#comment114845> Where does the user validation happen? Wouldn't want to create a directory with a username that isn't a valid directory_name. Wouldn't want to download anything if we didn't have permissions for that user anyway. src/slave/containerizer/fetcher.cpp <https://reviews.apache.org/r/30037/#comment114846> Our TODO style prefers the TODO immediately following a "// " and immediately followed by ":", as in: `// TODO(username): Text.` Please restate your TODO as above on a separate line, as a statement ending in punctuation. This separates a comment about current behavior from a TODO statement. src/slave/containerizer/fetcher.cpp <https://reviews.apache.org/r/30037/#comment114848> Always prefer CopyFrom unless you know you need MergeFrom. src/slave/containerizer/fetcher.cpp <https://reviews.apache.org/r/30037/#comment114864> I think I'd feel better if allocateCacheFile was on a separate name from set_cache_filename, because as it is now, it almost looks like you're just calling an accessor, but you're actually creating a CacheFile and adding it to the `cache` hashmap, then accessing the name. I'm of the opinion that anything with a notable side effect should get its own line. src/slave/containerizer/fetcher.cpp <https://reviews.apache.org/r/30037/#comment114870> Probably also works as CopyFrom. src/slave/containerizer/fetcher.cpp <https://reviews.apache.org/r/30037/#comment114875> We usually use CHECKs to assert expected invariants about incoming parameters' values or expected class state. This CHECK just verifies that the logic above in this function is valid. `foreach(uri : uris) { add_items(item); }` should always result in the same size arrays. src/slave/containerizer/fetcher.cpp <https://reviews.apache.org/r/30037/#comment114891> You don't need to specify `process::` since you're already `using process::Future` above. src/slave/containerizer/fetcher.cpp <https://reviews.apache.org/r/30037/#comment114895> Why would you need to erase the pid here if it's already done in checkRunStatus? src/slave/containerizer/fetcher.cpp <https://reviews.apache.org/r/30037/#comment114894> Could this ever error? src/slave/containerizer/fetcher.cpp <https://reviews.apache.org/r/30037/#comment114897> What if somebody still wants to set custom FDs for out/err? src/slave/containerizer/fetcher.cpp <https://reviews.apache.org/r/30037/#comment114898> I can't tell if these are lined up correctly, but they look off in RB. src/slave/containerizer/fetcher.cpp <https://reviews.apache.org/r/30037/#comment114899> Please also pass HADOOP_HOME. Spell out "environment variable"? src/slave/containerizer/fetcher.cpp <https://reviews.apache.org/r/30037/#comment114863> Please comment about whether this returns the '.' as part of the extension string src/slave/containerizer/fetcher.cpp <https://reviews.apache.org/r/30037/#comment114901> Why increment before using? Do you have something against 0-based indexing? src/slave/containerizer/fetcher.cpp <https://reviews.apache.org/r/30037/#comment114902> Which filesystems don't accept files starting with a digit? Can you give an example in the comment? src/slave/containerizer/fetcher.cpp <https://reviews.apache.org/r/30037/#comment114904> So, cache_filename doesn't include the original basename at all? Just `c1234.tgz`? That makes it hard for an admin to debug something going wrong with the fetcher/cache. Do we at least log (even VLOG) somewhere that URL "foo" is being downloaded to the cache as "c1234.bar"? Maybe when we allocateCachefile src/slave/containerizer/fetcher.cpp <https://reviews.apache.org/r/30037/#comment114862> No delimiter between user and uri? src/slave/containerizer/fetcher.cpp <https://reviews.apache.org/r/30037/#comment114905> Please wrap after the '=' src/slave/containerizer/fetcher.cpp <https://reviews.apache.org/r/30037/#comment114906> You really shouldn't be setting the Promise from outside the CacheFile (where it is private). Just add CacheFile::completed(cacheDir) and have it set the promise. - Adam B On Jan. 28, 2015, 12:54 a.m., Bernd Mathiske wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/30037/ > ----------------------------------------------------------- > > (Updated Jan. 28, 2015, 12:54 a.m.) > > > Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and > Timothy Chen. > > > Bugs: MESOS-2069 > https://issues.apache.org/jira/browse/MESOS-2069 > > > Repository: mesos-git > > > Description > ------- > > Extends the fetcher info protobuf with "actions" (fetch directly, through > cache, retrieve from cache). Switches the basis for dealing with uris to > "items", which contain the uri, the action, and potentially and cache file > name. Refactors fetch() and run(), so there is only one of each. Introduces > about half of the actual cache logic, including a hashmap of cache file > objects for bookkeeping and basic operations on it. > > > Diffs > ----- > > include/mesos/fetcher/fetcher.proto > facb87b92bf3194516f636dcc348e136af537721 > src/launcher/fetcher.cpp fed0105946da579a38357a30e7ae56e646e05b89 > src/slave/containerizer/fetcher.hpp > 1db0eaf002c8d0eaf4e0391858e61e0912b35829 > src/slave/containerizer/fetcher.cpp > d290f95251def3952c5ee34f600e1d71467f6293 > > Diff: https://reviews.apache.org/r/30037/diff/ > > > Testing > ------- > > make check > > > Thanks, > > Bernd Mathiske > >
