Jamie, I’m in Europe this week, so the timing of my responses is out of sync / delayed. There are two issues in play here. The first is having a pluggable Mesos fetcher; it sounds like that is scheduled for 0.30. The other is what is available on DC/OS. Could you move that discussion to that mailing list? I will definitely work with you on getting this resolved.
ken

> On May 10, 2016, at 3:45 PM, Briant, James <[email protected]> wrote:
>
> Ok. Thanks Joseph. I will figure out how to get a more recent hadoop onto my
> dcos agents then.
>
> Jamie
>
> From: Joseph Wu <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Tuesday, May 10, 2016 at 1:40 PM
> To: user <[email protected]>
> Subject: Re: Enable s3a for fetcher
>
> I can't speak to what DCOS does or will do (you can ask on the associated
> mailing list: [email protected]).
>
> We will be maintaining existing functionality for the fetcher, which means
> supporting the schemes:
> * file
> * http, https, ftp, ftps
> * hdfs, hftp, s3, s3n  <-- These rely on hadoop.
>
> And we will retain the --hadoop_home agent flag, which you can use to specify
> the hadoop binary.
>
> Other schemes might work right now, if you hack around with your node setup.
> But there's no guarantee that your hack will work between Mesos versions. In
> future, we will associate a fetcher plugin with each scheme. And you will be
> able to load custom fetcher plugins for additional schemes.
>
> TLDR: no "nerfing" and less hackiness :)
>
> On Tue, May 10, 2016 at 12:58 PM, Briant, James <[email protected]> wrote:
>> This is from the latest Mesos documentation:
>>
>> If the requested URI is based on some other protocol, then the fetcher tries
>> to utilise a local Hadoop client and hence supports any protocol supported
>> by the Hadoop client, e.g., HDFS, S3. See the slave configuration
>> documentation <http://mesos.apache.org/documentation/latest/configuration/>
>> for how to configure the slave with a path to the Hadoop client. [emphasis added]
>>
>> What you are saying is that DC/OS simply won't install hadoop on agents?
>>
>> Next question then: will you be nerfing fetcher.cpp, or will I be able to
>> install hadoop on the agents myself, such that Mesos will recognize s3a?
>>
>> From: Joseph Wu <[email protected]>
>> Reply-To: "[email protected]" <[email protected]>
>> Date: Tuesday, May 10, 2016 at 12:20 PM
>> To: user <[email protected]>
>> Subject: Re: Enable s3a for fetcher
>>
>> Mesos does not explicitly support HDFS and S3. Rather, Mesos will assume
>> you have a hadoop binary and use it (blindly) for certain types of URIs. If
>> the hadoop binary is not present, the mesos-fetcher will fail to fetch your
>> HDFS or S3 URIs.
>>
>> Mesos does not ship/package hadoop, so these URIs are not expected to work
>> out of the box (for plain Mesos distributions). In all cases, the operator
>> must preconfigure hadoop on each node (similar to how Docker in Mesos works).
>>
>> Here's the epic tracking the modularization of the mesos-fetcher (I estimate
>> it'll be done by 0.30): https://issues.apache.org/jira/browse/MESOS-3918
>>
>> ^ Once done, it should be easier to plug in more fetchers, such as one for
>> your use case.
>>
>> On Tue, May 10, 2016 at 11:21 AM, Briant, James <[email protected]> wrote:
>>> I'm happy to have a default IAM role on the box that can read-only fetch from
>>> my s3 bucket. s3a gets the credentials from the AWS instance metadata. It works.
>>>
>>> If hadoop is gone, does that mean that hdfs: URIs don't work either?
>>>
>>> Are you saying DC/OS and Mesos are diverging? Mesos explicitly supports hdfs
>>> and s3.
>>>
>>> In the absence of S3, how do you propose I make large binaries available to
>>> my cluster, and only to my cluster, on AWS?
>>>
>>> Jamie
>>>
>>> From: Cody Maloney <[email protected]>
>>> Reply-To: "[email protected]" <[email protected]>
>>> Date: Tuesday, May 10, 2016 at 10:58 AM
>>> To: "[email protected]" <[email protected]>
>>> Subject: Re: Enable s3a for fetcher
>>>
>>> The s3 fetcher stuff inside of DC/OS is not supported. The `hadoop` binary
>>> has been entirely removed from DC/OS 1.8 already. There have been various
>>> proposals to make the Mesos fetcher much more pluggable / extensible
>>> (https://issues.apache.org/jira/browse/MESOS-2731, for instance).
>>>
>>> Generally speaking, people want many different sorts of fetching, and there
>>> are all sorts of questions about how to properly get auth to the various
>>> chunks (if you're using s3a:// you presumably need to get credentials there
>>> somehow; otherwise you could just use http://). That needs to be designed and
>>> built into Mesos and DC/OS before this stuff is usable.
>>>
>>> Cody
>>>
>>> On Tue, May 10, 2016 at 9:55 AM Briant, James <[email protected]> wrote:
>>>> I want to use s3a: urls in the fetcher. I'm using DC/OS 1.7, which has hadoop
>>>> 2.5 on its agents. This version has the necessary hadoop-aws and aws-sdk jars:
>>>>
>>>> hadoop--afadb46fe64d0ee7ce23dbe769e44bfb0767a8b9]$ ls usr/share/hadoop/tools/lib/ | grep aws
>>>> aws-java-sdk-1.7.4.jar
>>>> hadoop-aws-2.5.0-cdh5.3.3.jar
>>>>
>>>> What config/scripts do I need to hack to get these guys on the classpath
>>>> so that "hadoop fs -copyToLocal" works?
>>>>
>>>> Thanks,
>>>> Jamie
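For what it's worth, the usual answer to Jamie's classpath question is Hadoop's HADOOP_CLASSPATH hook. A minimal sketch, not DC/OS-specific guidance: the /opt/hadoop install path and the bucket/object names are assumptions, and some older Hadoop builds may additionally need fs.s3a.impl declared in core-site.xml.

    # Put the bundled hadoop-aws and aws-sdk jars on the Hadoop classpath so
    # "hadoop fs" understands s3a:// URIs. /opt/hadoop is an assumed location;
    # point HADOOP_HOME at wherever hadoop is actually unpacked on the agent.
    export HADOOP_HOME=/opt/hadoop
    export HADOOP_CLASSPATH="${HADOOP_HOME}/share/hadoop/tools/lib/*"

    # With a read-only IAM instance role on the box, s3a picks up credentials
    # from the AWS instance metadata, so no keys need to be configured here.
    # Bucket and object names below are placeholders.
    "${HADOOP_HOME}/bin/hadoop" fs -copyToLocal s3a://my-bucket/my-large-binary.tar.gz /tmp/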

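And on the Mesos side, per Joseph's note that the --hadoop_home agent flag is being retained: a sketch of pointing the agent at that same Hadoop install so the fetcher can shell out to it for hadoop-backed schemes. The master address and work_dir values are placeholders.

    # Tell the agent where the Hadoop client lives; the fetcher invokes it for
    # URIs it does not handle natively (hdfs, s3, s3n, and s3a if the local
    # client supports that scheme).
    mesos-slave --master=zk://master.mesos:2181/mesos \
                --work_dir=/var/lib/mesos \
                --hadoop_home=/opt/hadoop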
