> On Jan. 28, 2015, 2:47 a.m., Adam B wrote: > > src/slave/containerizer/fetcher.hpp, line 98 > > <https://reviews.apache.org/r/30037/diff/8/?file=833520#file833520line98> > > > > This generates a unique pid, in case of multiple simultaneous fetchers? > > Bernd Mathiske wrote: > Each slave has its own fetcher, so we need unique fetcher pids. (You > could also get subtle bugs in recovery tests otherwise.)
Maybe it would make sense to name the fetcher after the slaveId, since those would already be unique? Then you could even tell which slave's fetcher you're looking at when debugging libprocess traces. > On Jan. 28, 2015, 2:47 a.m., Adam B wrote: > > src/slave/containerizer/fetcher.cpp, lines 68-69 > > <https://reviews.apache.org/r/30037/diff/8/?file=833521#file833521line68> > > > > What about fetching from ftp://, ftps, hdfs, hftp, s3, s3n, etc? This > > "Malformed URI" logic used to apply to ftp/ftps as well, but I'm guessing > > s3 and hdfs also wouldn't like fetching from a URI that doesn't have a '/', > > or ends in '/'. Not sure why the protocol string really matters when > > determining a basename for the cache file. > > Bernd Mathiske wrote: > I do not know either. I just carried this code over from > launcher/fetcher.cpp, and I would like to see a clean, comprehensive solution > in stout. Hence the TODO. My new code should do exactly what the code before > it did in this regard. Not "exactly" what we were doing in 0.21.1. See https://github.com/apache/mesos/blob/0.21.1/src/launcher/fetcher.cpp#L114 As I mentioned, this check also used to include ftp/ftps uris. Also, it used to `LOG(ERROR) << "Malformed URL (missing path)";` You should update the comment to clarify what the previous bug/feature intended. Here's my interpretation: 1) Check that there is at least one '/' in the URI after the '://', since a fetch of "http://mesos.apache.org" is not a valid fetcher URI, it's just a website/hostname; 2) Check that the URI does not end in a '/', since that's just a website/directory like "http://mesos.apache.org/downloads/". Granted, these URIs could just pull down an index.html, but that's implicit behavior, and perhaps it's better to explicitly fetch "http://mesos.apache.org/index.html" if that's what you want. I agree that we could do more intelligent URI parsing/verification, but maybe we should just let hdfs or net::download check for those errors. I'm not even sure why we originally added these checks. I traced them back to 2012; you'll have to ask Flo: commit ecf3b4a14d87c0eab2543f0fbc6c6efe208abf9c Author: Benjamin Hindman <[email protected]> Date: Thu Mar 15 23:22:43 2012 +0000 Adding support to retrieve files (uri's) via HTTP/FTP (contributed by Florian Leibert). > On Jan. 28, 2015, 2:47 a.m., Adam B wrote: > > src/slave/containerizer/fetcher.cpp, lines 478-480 > > <https://reviews.apache.org/r/30037/diff/8/?file=833521#file833521line478> > > > > So, cache_filename doesn't include the original basename at all? Just > > `c1234.tgz`? That makes it hard for an admin to debug something going wrong > > with the fetcher/cache. Do we at least log (even VLOG) somewhere that URL > > "foo" is being downloaded to the cache as "c1234.bar"? Maybe when we > > allocateCachefile > > Bernd Mathiske wrote: > Every download needs a unique filename. Options: > - wrap the download in a dir => do we run out of dir entries in the fs? > - concatenate the original basename => who knows how long that might be? > Combined too long? Might even run into path length limits. > - add part of the basename just for humans => I'll go for that! Say, up > to 20 chars? > > Yes, we log everything in stderr in the sandbox. Yeah, I like the human-readable prefix option. Makes more sense than just 'c'. - Adam ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30037/#review69973 ----------------------------------------------------------- On Jan. 28, 2015, 9:33 a.m., Bernd Mathiske wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/30037/ > ----------------------------------------------------------- > > (Updated Jan. 28, 2015, 9:33 a.m.) > > > Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and > Timothy Chen. > > > Bugs: MESOS-2069 > https://issues.apache.org/jira/browse/MESOS-2069 > > > Repository: mesos > > > Description > ------- > > Extends the fetcher info protobuf with "actions" (fetch directly, through > cache, retrieve from cache). Switches the basis for dealing with uris to > "items", which contain the uri, the action, and potentially and cache file > name. Refactors fetch() and run(), so there is only one of each. Introduces > about half of the actual cache logic, including a hashmap of cache file > objects for bookkeeping and basic operations on it. > > > Diffs > ----- > > include/mesos/fetcher/fetcher.proto > facb87b92bf3194516f636dcc348e136af537721 > src/launcher/fetcher.cpp fed0105946da579a38357a30e7ae56e646e05b89 > src/slave/containerizer/fetcher.hpp > 1db0eaf002c8d0eaf4e0391858e61e0912b35829 > src/slave/containerizer/fetcher.cpp > d290f95251def3952c5ee34f600e1d71467f6293 > > Diff: https://reviews.apache.org/r/30037/diff/ > > > Testing > ------- > > make check > > > Thanks, > > Bernd Mathiske > >
