> On Jan. 28, 2015, 2:47 a.m., Adam B wrote:
> > src/slave/containerizer/fetcher.cpp, lines 68-69
> > <https://reviews.apache.org/r/30037/diff/8/?file=833521#file833521line68>
> >
> >     What about fetching from ftp://, ftps, hdfs, hftp, s3, s3n, etc? This 
> > "Malformed URI" logic used to apply to ftp/ftps as well, but I'm guessing 
> > s3 and hdfs also wouldn't like fetching from a URI that doesn't have a '/', 
> > or ends in '/'. Not sure why the protocol string really matters when 
> > determining a basename for the cache file.
> 
> Bernd Mathiske wrote:
>     I do not know either. I just carried this code over from 
> launcher/fetcher.cpp, and I would like to see a clean, comprehensive solution 
> in stout. Hence the TODO. My new code should do exactly what the code before 
> it did in this regard.
> 
> Adam B wrote:
>     Not "exactly" what we were doing in 0.21.1. See 
> https://github.com/apache/mesos/blob/0.21.1/src/launcher/fetcher.cpp#L114
>     As I mentioned, this check also used to include ftp/ftps uris. Also, it 
> used to `LOG(ERROR) << "Malformed URL (missing path)";`
>     
>     You should update the comment to clarify what the previous bug/feature 
> intended. Here's my interpretation: 1) Check that there is at least one '/' 
> in the URI after the '://', since a fetch of "http://mesos.apache.org"; is not 
> a valid fetcher URI, it's just a website/hostname; 2) Check that the URI does 
> not end in a '/', since that's just a website/directory like 
> "http://mesos.apache.org/downloads/";. Granted, these URIs could just pull 
> down an index.html, but that's implicit behavior, and perhaps it's better to 
> explicitly fetch "http://mesos.apache.org/index.html"; if that's what you want.
>     I agree that we could do more intelligent URI parsing/verification, but 
> maybe we should just let hdfs or net::download check for those errors. I'm 
> not even sure why we originally added these checks. I traced them back to 
> 2012; you'll have to ask Flo:
>     commit ecf3b4a14d87c0eab2543f0fbc6c6efe208abf9c
>     Author: Benjamin Hindman <[email protected]>
>     Date:   Thu Mar 15 23:22:43 2012 +0000
>         Adding support to retrieve files (uri's) via HTTP/FTP (contributed by 
> Florian Leibert).
> 
> Adam B wrote:
>     Oh yeah, we need a '/' in the "path", and not as its last character, 
> since we want `path.substr(path.find_last_of("/") + 1)` to return a non-empty 
> string for basename(). Any other "invalid" characters will just pass through.

I looked through older code than the version I started out with, and indeed, 
there were indeed more protocols mentioned back then. I am bringing them back 
now. Thanks for spotting this!

It occurred to me that the way this check works, we do not need to enumerate a 
list of possible protocols, but we can check anything that starts with a short 
sequence of chars followed by "://" the same way, as this must be a URIof some 
sort. So I implemented this and no more protocols will be excluded.


- Bernd


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30037/#review69973
-----------------------------------------------------------


On Jan. 28, 2015, 9:33 a.m., Bernd Mathiske wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30037/
> -----------------------------------------------------------
> 
> (Updated Jan. 28, 2015, 9:33 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and 
> Timothy Chen.
> 
> 
> Bugs: MESOS-2069
>     https://issues.apache.org/jira/browse/MESOS-2069
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Extends the fetcher info protobuf with "actions" (fetch directly, through 
> cache, retrieve from cache). Switches the basis for dealing with uris to 
> "items", which contain the uri, the action, and potentially and cache file 
> name. Refactors fetch() and run(), so there is only one of each. Introduces 
> about half of the actual cache logic, including a hashmap of cache file 
> objects for bookkeeping and basic operations on it. 
> 
> 
> Diffs
> -----
> 
>   include/mesos/fetcher/fetcher.proto 
> facb87b92bf3194516f636dcc348e136af537721 
>   src/launcher/fetcher.cpp fed0105946da579a38357a30e7ae56e646e05b89 
>   src/slave/containerizer/fetcher.hpp 
> 1db0eaf002c8d0eaf4e0391858e61e0912b35829 
>   src/slave/containerizer/fetcher.cpp 
> d290f95251def3952c5ee34f600e1d71467f6293 
> 
> Diff: https://reviews.apache.org/r/30037/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Bernd Mathiske
> 
>

Reply via email to