Completely +1 to this. There are now quite a lot of Hadoop-compatible filesystem wrappers out in the wild, and this would certainly be very useful.
I'm happy to contribute a patch; a rough sketch of the dispatch I have in mind is included after the quoted thread below. Here are a few related issues that might be of interest:

- https://issues.apache.org/jira/browse/MESOS-1887
- https://issues.apache.org/jira/browse/MESOS-1316
- https://issues.apache.org/jira/browse/MESOS-336
- https://issues.apache.org/jira/browse/MESOS-1248

On 31 October 2014 22:39, Tim Chen <[email protected]> wrote:

> I believe there is already a JIRA ticket for this; if you search for
> fetcher in Mesos JIRA I think you can find it.
>
> Tim
>
> On Fri, Oct 31, 2014 at 3:27 PM, Ankur Chauhan <[email protected]> wrote:
>
>> Hi,
>>
>> I have been looking at some of the stuff around the fetcher and saw
>> something interesting. The fetcher::fetch method depends on a hard-coded
>> list of URL schemes. No doubt this works, but it is very restrictive.
>> Hadoop/HDFS, on the other hand, is quite flexible about what it can
>> fetch: it handles a large number of URL types and can be extended by
>> adding configuration to conf/hdfs-site.xml and core-site.xml.
>>
>> What I am proposing is that we refactor fetcher.cpp to prefer the HDFS
>> client (via hdfs/hdfs.hpp) for all fetching whenever HADOOP_HOME is set
>> and $HADOOP_HOME/bin/hadoop is available. This logic already exists and
>> we can just use it. The fallback logic using net::download or a local
>> file copy may be left in place for installations that do not have Hadoop
>> configured. This means that if Hadoop is present we can directly fetch
>> URLs such as tachyon://..., snackfs://..., cfs://..., ftp://..., s3://...,
>> http://..., and file://... with no extra effort. This makes for a much
>> better experience when it comes to debugging and extensibility.
>>
>> What do others think about this?
>>
>> - Ankur
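To make the proposal concrete, here is a minimal sketch of the dispatch order described above. This is not the actual fetcher.cpp code: `hadoop fs -copyToLocal` is the real client command, but the curl/cp fallbacks merely stand in for the existing net::download and local-copy paths, the function names are illustrative only, and there is no shell escaping or error reporting as a real patch would need.

```cpp
// Sketch only: prefer the hadoop client when available, otherwise fall back.
#include <cstdlib>
#include <string>
#include <unistd.h>

// True when HADOOP_HOME is set and $HADOOP_HOME/bin/hadoop is executable.
static bool hadoopAvailable() {
  const char* home = std::getenv("HADOOP_HOME");
  if (home == nullptr) {
    return false;
  }
  const std::string client = std::string(home) + "/bin/hadoop";
  return ::access(client.c_str(), X_OK) == 0;
}

// Fetch `uri` into `directory`. The hadoop client already understands any
// scheme configured in core-site.xml (hdfs, s3, ftp, tachyon, ...), so it is
// tried first; otherwise we fall back to an HTTP/FTP download or a plain copy.
static int fetch(const std::string& uri, const std::string& directory) {
  std::string command;

  if (hadoopAvailable()) {
    command = std::string(std::getenv("HADOOP_HOME")) +
              "/bin/hadoop fs -copyToLocal '" + uri + "' '" + directory + "'";
  } else if (uri.compare(0, 7, "http://") == 0 ||
             uri.compare(0, 6, "ftp://") == 0) {
    // Stand-in for net::download in the real code.
    command = "curl -sSfL '" + uri + "' -o '" + directory + "/download'";
  } else {
    // Plain local copy fallback.
    command = "cp '" + uri + "' '" + directory + "/'";
  }

  return std::system(command.c_str());
}

int main(int argc, char** argv) {
  if (argc != 3) {
    return 1;
  }
  return fetch(argv[1], argv[2]) == 0 ? 0 : 1;
}
```

The point of routing through the hadoop client first is that any filesystem registered in core-site.xml works without further changes to Mesos itself; only installations without Hadoop ever hit the fallback paths.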

