Hi Ankur,

Can you post it on Review Board? We can discuss the code there.
Tim

Sent from my iPhone

> On Nov 1, 2014, at 6:29 PM, Ankur Chauhan <[email protected]> wrote:
>
> Hi Tim,
>
> I don't think there is an issue directly in line with what I wanted, but the
> closest one I could find in JIRA is
> https://issues.apache.org/jira/browse/MESOS-1711
>
> I have a branch
> (https://github.com/ankurcha/mesos/compare/prefer_hadoop_fetcher) with a
> change that would let users specify any HDFS-compatible URI to the
> mesos-fetcher, but maybe you can weigh in on it. Do you think this is the
> right track? If so, I would like to pick up this issue and submit a patch
> for review.
>
> -- Ankur
>
>> On 1 Nov 2014, at 04:32, Tom Arnfeld <[email protected]> wrote:
>>
>> Completely +1 to this. There are now quite a lot of Hadoop-compatible
>> filesystem wrappers out in the wild, and this would certainly be very
>> useful.
>>
>> I'm happy to contribute a patch. Here are a few related issues that might
>> be of interest:
>>
>> - https://issues.apache.org/jira/browse/MESOS-1887
>> - https://issues.apache.org/jira/browse/MESOS-1316
>> - https://issues.apache.org/jira/browse/MESOS-336
>> - https://issues.apache.org/jira/browse/MESOS-1248
>>
>>> On 31 October 2014 22:39, Tim Chen <[email protected]> wrote:
>>> I believe there is already a JIRA ticket for this; if you search for
>>> "fetcher" in the Mesos JIRA I think you can find it.
>>>
>>> Tim
>>>
>>>> On Fri, Oct 31, 2014 at 3:27 PM, Ankur Chauhan <[email protected]> wrote:
>>>> Hi,
>>>>
>>>> I have been looking at some of the code around the fetcher and noticed
>>>> something interesting. The fetcher::fetch method depends on a hard-coded
>>>> list of URL schemes. No doubt this works, but it is very restrictive.
>>>> Hadoop/HDFS is far more flexible about what it can fetch: it handles a
>>>> large number of URL schemes out of the box and can be extended with
>>>> additional filesystems via conf/hdfs-site.xml and core-site.xml.
>>>>
>>>> What I am proposing is that we refactor fetcher.cpp to prefer the HDFS
>>>> client (via hdfs/hdfs.hpp) for all fetching whenever HADOOP_HOME is set
>>>> and $HADOOP_HOME/bin/hadoop is available. This logic already exists, so
>>>> we can simply reuse it. The fallback logic that uses net::download or a
>>>> local file copy could be left in place for installations that do not
>>>> have hadoop configured. This means that if hadoop is present we could
>>>> directly fetch URIs such as tachyon://..., snackfs://..., cfs://...,
>>>> ftp://..., s3://..., http://..., and file://... with no extra effort,
>>>> which makes for a much better experience in terms of debugging and
>>>> extensibility.
>>>>
>>>> What do others think about this?
>>>>
>>>> - Ankur
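
For reference, a minimal standalone sketch of the dispatch being proposed
above. It shells out to the hadoop CLI rather than going through Mesos's
hdfs/hdfs.hpp wrapper, and the helper names here (hadoopAvailable, fetch)
are illustrative, not the actual fetcher.cpp code:

    #include <cstdlib>     // std::getenv, std::system
    #include <iostream>
    #include <string>
    #include <sys/stat.h>  // ::stat

    // Illustrative helper: true if $HADOOP_HOME is set and
    // $HADOOP_HOME/bin/hadoop exists on disk.
    static bool hadoopAvailable()
    {
        const char* home = std::getenv("HADOOP_HOME");
        if (home == nullptr) {
            return false;
        }
        struct stat s;
        return ::stat((std::string(home) + "/bin/hadoop").c_str(), &s) == 0;
    }

    // Proposed dispatch: prefer the hadoop client for every URI it can
    // resolve, and keep the existing paths only as a fallback.
    static int fetch(const std::string& uri, const std::string& directory)
    {
        if (hadoopAvailable()) {
            // 'hadoop fs -copyToLocal' resolves any scheme registered in
            // core-site.xml / hdfs-site.xml (hdfs://, s3://, ftp://, ...).
            const std::string cmd =
                std::string(std::getenv("HADOOP_HOME")) +
                "/bin/hadoop fs -copyToLocal '" + uri + "' '" +
                directory + "'";
            return std::system(cmd.c_str());
        }

        // Fallback for installations without hadoop configured; the real
        // fetcher would call net::download() or do a local copy here.
        std::cerr << "hadoop not found; falling back to built-in fetch\n";
        return -1;
    }

    int main(int argc, char** argv)
    {
        if (argc != 3) {
            std::cerr << "usage: fetch <uri> <directory>\n";
            return 1;
        }
        return fetch(argv[1], argv[2]) == 0 ? 0 : 1;
    }

Either way, the hard-coded scheme list disappears: whether a URI is
fetchable becomes a property of the Hadoop configuration rather than of
fetcher.cpp itself.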

