> On Nov. 4, 2014, 9:46 p.m., Vinod Kone wrote:
> > src/launcher/fetcher.cpp, line 223
> > <https://reviews.apache.org/r/27483/diff/7/?file=748389#file748389line223>
> >
> >     Why fall through if result.isError()?
> 
> Ankur Chauhan wrote:
>     The way I think of the fetcher logic is: "we try the fetchers one by one 
> until one succeeds, *even* if an earlier, unsuccessful fetcher claimed it 
> could handle the URI". This means that if a fetcher fails for any reason 
> (maybe the mesos-fetcher process doesn't have permissions, or whatever the 
> case may be), we always go on to the next one. I understand that a case can 
> be made for something along the lines of 
>     
>     ```
>     if (local.canHandle(uri)) {
>       return local.fetch(uri);
>     } else if (os_net.canHandle(uri)) {
>       return os_net.fetch(uri);
>     } else if (hadoop.canHandle(uri)) {
>       return hadoop.fetch(uri);
>     } else {
>       return Error("Can't download uri:" + uri);
>     }
>     ```
>     
>     But that assumes the local fetcher has all the permissions needed to 
> handle local/file URIs *and* that os_net has all the configuration needed to 
> handle every http(s)/ftp(s) URI, and it takes away the fallback. It may be 
> that the hadoop fetcher (and hence the hadoop client) has credentials in its 
> config files that aren't present in the URIs themselves. An example of this 
> is an ftp URI provided without a user or password, where core-site.xml has 
> the required user/password in it.
>     
>     Does this answer your concern?
> 
> Vinod Kone wrote:
>     I'm concerned about how the logs will look to a user when fetching URIs. 
> For a non-local URI they would always see a message about a failed local 
> copy before HDFS or Net is used, which is bad.
>     
>     I would much rather make it explicit what you are doing, with comments.
>     
>     ```
>     if (local.canHandle(uri)) {
>       result = local.fetch(uri);
>       if (result.isSome()) {
>         return result;
>       } else {
>     LOG(WARNING) << "Failed to copy local file: " << result.error();
>       }
>     } else if (os_net.canHandle(uri)) {
>       result = os_net.fetch(uri);
>       if (result.isSome()) {
>         return result;
>       } else {
>     LOG(WARNING) << "Failed to download URI: " << result.error();
>       }
>     }
>     
>     // If we are here, one of the following is possible.
>     // 1. Failed to copy a local URI. 
>     // 2. Failed to download a remote URI using Os.net.
>     // 3. HDFS compatible URI.
>     // 4. Unexpected URI.
>     // For all these cases we try to fetch the URI using the Hadoop client 
> as a fallback.
>     
>     return hadoop.fetch(uri);
>     
>     ```
>     
>     More importantly, are there really file permission issues where "cp" 
> doesn't work but "hadoop fs copyToLocal" works? Can you give me a concrete 
> example? I ask because if the hadoop client hangs for whatever reason (we 
> have seen this in production at Twitter), it will take up to the executor 
> registration timeout for the slave to kill the executor and transition the 
> task to LOST. Getting rid of the hadoop dependency was one of the reasons we 
> install our executors locally. This change adds the dependency back on 
> hadoop, even for local files, which is unfortunate.
> 
> Ankur Chauhan wrote:
>     That does make sense; I hadn't considered the logging :-( and I am all 
> for avoiding hard external dependencies if I can. Let me think a little on 
> that.

The latest patch should take care of this. I am following the same approach as 
above.
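
For reference, the explicit try-then-fallback flow from the snippet above can be sketched roughly as follows. This is a minimal standalone sketch, not the actual patch: `Result`, `Fetcher`, `fetchUri`, and `makeDemoFetchers` are illustrative stand-ins for `Try<std::string>` and the real local, os::net, and hadoop fetchers in src/launcher/fetcher.cpp.

```cpp
#include <functional>
#include <iostream>
#include <string>
#include <vector>

// Stand-in for Try<std::string>: a value on success, an error message on
// failure. (Illustrative only.)
struct Result {
  bool ok;
  std::string value;  // Fetched path if ok, error message otherwise.
};

// A fetcher knows whether it can handle a URI and how to fetch it.
struct Fetcher {
  std::string name;
  std::function<bool(const std::string&)> canHandle;
  std::function<Result(const std::string&)> fetch;
};

// Try the scheme-specific fetchers (all but the last) explicitly; on failure,
// log one warning and fall through to the last fetcher (hadoop) as the
// catch-all fallback -- the ordering discussed above.
inline Result fetchUri(const std::string& uri,
                       const std::vector<Fetcher>& fetchers) {
  for (size_t i = 0; i + 1 < fetchers.size(); ++i) {
    const Fetcher& fetcher = fetchers[i];
    if (!fetcher.canHandle(uri)) {
      continue;  // Not this fetcher's scheme; check the next one.
    }
    Result result = fetcher.fetch(uri);
    if (result.ok) {
      return result;
    }
    // Log once, then break to the hadoop fallback (not to the next
    // scheme-specific fetcher), so the logs stay readable.
    std::cerr << "WARNING: " << fetcher.name << " failed to fetch '" << uri
              << "': " << result.value << std::endl;
    break;
  }
  return fetchers.back().fetch(uri);
}

// Demo fetchers with hardcoded behavior, standing in for the real ones.
inline std::vector<Fetcher> makeDemoFetchers() {
  Fetcher local{
      "local",
      [](const std::string& uri) { return uri.rfind("file://", 0) == 0; },
      [](const std::string&) { return Result{true, "/tmp/sandbox/copy"}; }};
  Fetcher net{
      "os::net",
      [](const std::string& uri) { return uri.rfind("http://", 0) == 0; },
      [](const std::string&) { return Result{false, "connection refused"}; }};
  Fetcher hadoop{
      "hadoop",
      [](const std::string&) { return true; },  // Catch-all fallback.
      [](const std::string&) {
        return Result{true, "/tmp/sandbox/hdfs-copy"};
      }};
  return {local, net, hadoop};
}
```

With the demo fetchers, a `file://` URI succeeds locally, an `http://` URI fails in os::net and is retried through hadoop, and anything else (e.g. `hdfs://`) goes straight to hadoop.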


- Ankur


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27483/#review59842
-----------------------------------------------------------


On Nov. 5, 2014, 6:39 a.m., Ankur Chauhan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27483/
> -----------------------------------------------------------
> 
> (Updated Nov. 5, 2014, 6:39 a.m.)
> 
> 
> Review request for mesos, Timothy Chen, Timothy St. Clair, and Vinod Kone.
> 
> 
> Bugs: MESOS-1711
>     https://issues.apache.org/jira/browse/MESOS-1711
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Previously, the fetcher used a hardcoded list of schemes to determine which 
> URIs could be fetched by hadoop (if available). This has been changed so that 
> we first check whether hadoop can fetch a URI for us, and then fall back to 
> os::net and finally to the local copy method (same as before). This allows 
> users to fetch artifacts from arbitrary filesystems as long as hadoop is 
> correctly configured (in core-site.xml).
> 
> 
> Diffs
> -----
> 
>   src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 
>   src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 
>   src/tests/fetcher_tests.cpp d7754009a59fedb43e3422c56b3a786ce80164aa 
> 
> Diff: https://reviews.apache.org/r/27483/diff/
> 
> 
> Testing
> -------
> 
> make check
> sudo bin/mesos-tests.sh --verbose
> support/mesos-style.py
> 
> 
> Thanks,
> 
> Ankur Chauhan
> 
>
