Completely +1 to this. There are now quite a lot of Hadoop-compatible filesystem wrappers out in the wild, and this would certainly be very useful.
I'm happy to contribute a patch; a rough sketch of the dispatch I have in mind is included after the quoted thread below. Here are a few related issues that might be of interest:

- https://issues.apache.org/jira/browse/MESOS-1887
- https://issues.apache.org/jira/browse/MESOS-1316
- https://issues.apache.org/jira/browse/MESOS-336
- https://issues.apache.org/jira/browse/MESOS-1248

On 31 October 2014 22:39, Tim Chen <[email protected]> wrote:

> I believe there is already a JIRA ticket for this; if you search for
> fetcher in Mesos JIRA I think you can find it.
>
> Tim
>
> On Fri, Oct 31, 2014 at 3:27 PM, Ankur Chauhan <[email protected]> wrote:
>
>> Hi,
>>
>> I have been looking at some of the stuff around the fetcher and saw
>> something interesting. The fetcher::fetch method depends on a hard-coded
>> list of URL schemes. No doubt this works, but it is very restrictive.
>> Hadoop/HDFS, on the other hand, is quite flexible about what it can
>> fetch: it handles a large number of URL types and can be extended by
>> adding configuration to conf/hdfs-site.xml and core-site.xml.
>>
>> What I am proposing is that we refactor fetcher.cpp to prefer the HDFS
>> client (via hdfs/hdfs.hpp) for all fetching whenever HADOOP_HOME is set
>> and $HADOOP_HOME/bin/hadoop is available. This logic already exists and
>> we can just use it. The fallback logic using net::download or a local
>> file copy may be left in place for installations that do not have Hadoop
>> configured. This means that if Hadoop is present we can directly fetch
>> URLs such as tachyon://..., snackfs://..., cfs://..., ftp://..., s3://...,
>> http://..., and file://... with no extra effort. This makes for a much
>> better experience when it comes to debugging and extensibility.
>>
>> What do others think about this?
>>
>> - Ankur
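To make the proposal concrete, here is a minimal sketch of the dispatch order described above. This is not the actual fetcher.cpp code: `hadoop fs -copyToLocal` is the real client command, but the curl/cp fallbacks merely stand in for the existing net::download and local-copy paths, the function names are illustrative only, and there is no shell escaping or error reporting as a real patch would need.

```cpp
// Sketch only: prefer the hadoop client when available, otherwise fall back.
#include <cstdlib>
#include <string>
#include <unistd.h>

// True when HADOOP_HOME is set and $HADOOP_HOME/bin/hadoop is executable.
static bool hadoopAvailable() {
  const char* home = std::getenv("HADOOP_HOME");
  if (home == nullptr) {
    return false;
  }
  const std::string client = std::string(home) + "/bin/hadoop";
  return ::access(client.c_str(), X_OK) == 0;
}

// Fetch `uri` into `directory`. The hadoop client already understands any
// scheme configured in core-site.xml (hdfs, s3, ftp, tachyon, ...), so it is
// tried first; otherwise we fall back to an HTTP/FTP download or a plain copy.
static int fetch(const std::string& uri, const std::string& directory) {
  std::string command;

  if (hadoopAvailable()) {
    command = std::string(std::getenv("HADOOP_HOME")) +
              "/bin/hadoop fs -copyToLocal '" + uri + "' '" + directory + "'";
  } else if (uri.compare(0, 7, "http://") == 0 ||
             uri.compare(0, 6, "ftp://") == 0) {
    // Stand-in for net::download in the real code.
    command = "curl -sSfL '" + uri + "' -o '" + directory + "/download'";
  } else {
    // Plain local copy fallback.
    command = "cp '" + uri + "' '" + directory + "/'";
  }

  return std::system(command.c_str());
}

int main(int argc, char** argv) {
  if (argc != 3) {
    return 1;
  }
  return fetch(argv[1], argv[2]) == 0 ? 0 : 1;
}
```

The point of routing through the hadoop client first is that any filesystem registered in core-site.xml works without further changes to Mesos itself; only installations without Hadoop ever hit the fallback paths.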

