Hi Tim,

I don't think there is an issue directly in line with what I wanted, but the closest one I could find in JIRA is https://issues.apache.org/jira/browse/MESOS-1711
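To make the idea concrete, here is a minimal sketch of the dispatch I have in mind. The helper names (hadoopClient, fetch) and the exact fallback are mine for illustration and are not taken from fetcher.cpp; the only real dependency is the stock "hadoop fs -copyToLocal" command:

// Sketch only: prefer the Hadoop client for every URI when it is
// available, so fetcher.cpp no longer needs a hard-coded scheme list.
#include <cstdlib>
#include <string>
#include <sys/stat.h>

// Returns the path to the hadoop client, or "" if it is not usable.
static std::string hadoopClient()
{
  const char* home = std::getenv("HADOOP_HOME");
  if (home == nullptr) {
    return "";
  }

  std::string client = std::string(home) + "/bin/hadoop";

  struct stat s;
  if (stat(client.c_str(), &s) != 0 || !(s.st_mode & S_IXUSR)) {
    return "";
  }

  return client;
}

// Fetch 'uri' into 'directory'. Any filesystem registered in the
// cluster's Hadoop configuration (hdfs://, s3://, ftp://,
// tachyon://, ...) works without this code knowing the scheme.
static int fetch(const std::string& uri, const std::string& directory)
{
  const std::string client = hadoopClient();

  if (!client.empty()) {
    // 'hadoop fs -copyToLocal <src> <localdst>' is a stock Hadoop CLI
    // command; the quoting here is simplified for the sketch.
    const std::string command =
      client + " fs -copyToLocal '" + uri + "' '" + directory + "'";
    return std::system(command.c_str());
  }

  // Fallback: keep the existing net::download / local copy behaviour
  // for installations without Hadoop (placeholder here).
  return -1;
}

With something like this in place, any scheme the cluster's core-site.xml knows about becomes fetchable without touching Mesos again.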
I have a branch (https://github.com/ankurcha/mesos/compare/prefer_hadoop_fetcher) with a change that would let users pass any HDFS-compatible URI to the mesos-fetcher; maybe you can weigh in on it. Do you think this is the right track? If so, I would like to pick up this issue and submit a patch for review.

-- Ankur

> On 1 Nov 2014, at 04:32, Tom Arnfeld <[email protected]> wrote:
>
> Completely +1 to this. There are now quite a lot of Hadoop-compatible
> filesystem wrappers out in the wild, and this would certainly be very useful.
>
> I'm happy to contribute a patch. Here are a few related issues that might be
> of interest:
>
> - https://issues.apache.org/jira/browse/MESOS-1887
> - https://issues.apache.org/jira/browse/MESOS-1316
> - https://issues.apache.org/jira/browse/MESOS-336
> - https://issues.apache.org/jira/browse/MESOS-1248
>
> On 31 October 2014 22:39, Tim Chen <[email protected]> wrote:
> I believe there is already a JIRA ticket for this; if you search for fetcher
> in Mesos JIRA I think you can find it.
>
> Tim
>
> On Fri, Oct 31, 2014 at 3:27 PM, Ankur Chauhan <[email protected]> wrote:
> Hi,
>
> I have been looking at some of the code around the fetcher and saw something
> interesting. The fetcher::fetch method depends on a hard-coded list of URL
> schemes. No doubt this works, but it is very restrictive. Hadoop/HDFS, by
> contrast, is quite flexible about what it can fetch: it supports a large
> number of URL types and can be extended with new filesystems by adding
> configuration to conf/hdfs-site.xml and core-site.xml.
>
> What I am proposing is that we refactor fetcher.cpp to prefer HDFS (via
> hdfs/hdfs.hpp) for all fetching whenever HADOOP_HOME is set and
> $HADOOP_HOME/bin/hadoop is available. This logic already exists and we can
> just use it. The fallback logic of net::download or a local file copy may be
> left in place for installations that do not have Hadoop configured. This
> means that if Hadoop is present we can directly fetch URLs such as
> tachyon://..., snackfs://..., cfs://..., ftp://..., s3://..., http://...,
> and file://... with no extra effort. This makes for a much better experience
> when it comes to debugging and extensibility.
>
> What do others think about this?
>
> - Ankur

