Hi Tim,

I don't think there is an issue directly in line with what I wanted, but the closest one I could find in JIRA is https://issues.apache.org/jira/browse/MESOS-1711
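To make the idea concrete, here is a minimal sketch of the dispatch I have in mind. The helper names (hadoopClient, fetch) and the exact fallback are mine for illustration and are not taken from fetcher.cpp; the only real dependency is the stock "hadoop fs -copyToLocal" command:

// Sketch only: prefer the Hadoop client for every URI when it is
// available, so fetcher.cpp no longer needs a hard-coded scheme list.
#include <cstdlib>
#include <string>
#include <sys/stat.h>

// Returns the path to the hadoop client, or "" if it is not usable.
static std::string hadoopClient()
{
  const char* home = std::getenv("HADOOP_HOME");
  if (home == nullptr) {
    return "";
  }

  std::string client = std::string(home) + "/bin/hadoop";

  struct stat s;
  if (stat(client.c_str(), &s) != 0 || !(s.st_mode & S_IXUSR)) {
    return "";
  }

  return client;
}

// Fetch 'uri' into 'directory'. Any filesystem registered in the
// cluster's Hadoop configuration (hdfs://, s3://, ftp://,
// tachyon://, ...) works without this code knowing the scheme.
static int fetch(const std::string& uri, const std::string& directory)
{
  const std::string client = hadoopClient();

  if (!client.empty()) {
    // 'hadoop fs -copyToLocal <src> <localdst>' is a stock Hadoop CLI
    // command; the quoting here is simplified for the sketch.
    const std::string command =
      client + " fs -copyToLocal '" + uri + "' '" + directory + "'";
    return std::system(command.c_str());
  }

  // Fallback: keep the existing net::download / local copy behaviour
  // for installations without Hadoop (placeholder here).
  return -1;
}

With something like this in place, any scheme the cluster's core-site.xml knows about becomes fetchable without touching Mesos again.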
I have a branch (https://github.com/ankurcha/mesos/compare/prefer_hadoop_fetcher) with a change that would let users pass any HDFS-compatible URI to the mesos-fetcher; maybe you can weigh in on it. Do you think this is the right track? If so, I would like to pick up this issue and submit a patch for review.

-- Ankur

> On 1 Nov 2014, at 04:32, Tom Arnfeld <[email protected]> wrote:
>
> Completely +1 to this. There are now quite a lot of Hadoop-compatible
> filesystem wrappers out in the wild, and this would certainly be very useful.
>
> I'm happy to contribute a patch. Here are a few related issues that might be
> of interest:
>
> - https://issues.apache.org/jira/browse/MESOS-1887
> - https://issues.apache.org/jira/browse/MESOS-1316
> - https://issues.apache.org/jira/browse/MESOS-336
> - https://issues.apache.org/jira/browse/MESOS-1248
>
> On 31 October 2014 22:39, Tim Chen <[email protected]> wrote:
> I believe there is already a JIRA ticket for this; if you search for fetcher
> in Mesos JIRA I think you can find it.
>
> Tim
>
> On Fri, Oct 31, 2014 at 3:27 PM, Ankur Chauhan <[email protected]> wrote:
> Hi,
>
> I have been looking at some of the code around the fetcher and saw something
> interesting. The fetcher::fetch method depends on a hard-coded list of URL
> schemes. No doubt this works, but it is very restrictive. Hadoop/HDFS, by
> contrast, is quite flexible about what it can fetch: it supports a large
> number of URL types and can be extended with new filesystems by adding
> configuration to conf/hdfs-site.xml and core-site.xml.
>
> What I am proposing is that we refactor fetcher.cpp to prefer HDFS (via
> hdfs/hdfs.hpp) for all fetching whenever HADOOP_HOME is set and
> $HADOOP_HOME/bin/hadoop is available. This logic already exists and we can
> just use it. The fallback logic of net::download or a local file copy may be
> left in place for installations that do not have Hadoop configured. This
> means that if Hadoop is present we can directly fetch URLs such as
> tachyon://..., snackfs://..., cfs://..., ftp://..., s3://..., http://...,
> and file://... with no extra effort. This makes for a much better experience
> when it comes to debugging and extensibility.
>
> What do others think about this?
>
> - Ankur

