This is from the latest Mesos documentation:

If the requested URI is based on some other protocol, then the fetcher tries to
utilise a local Hadoop client and hence supports any protocol supported by the
Hadoop client, e.g., HDFS, S3. See the slave configuration documentation
(http://mesos.apache.org/documentation/latest/configuration/) for how to
configure the slave with a path to the Hadoop client. [emphasis added]
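
For reference, per that configuration page, pointing the agent at a Hadoop
install looks something like the following; the ZooKeeper address and the
install path are placeholders of mine:

# Illustrative slave invocation: --hadoop_home points the fetcher at the
# Hadoop installation whose bin/hadoop it should shell out to.
mesos-slave --master=zk://master.example.com:2181/mesos \
            --hadoop_home=/opt/hadoop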

What you are saying is that DC/OS simply won't install Hadoop on the agents?

Next question then: will you be nerfing fetcher.cpp, or will I be able to 
install Hadoop on the agents myself, such that Mesos will recognize s3a?


From: Joseph Wu <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, May 10, 2016 at 12:20 PM
To: user <[email protected]>
Subject: Re: Enable s3a for fetcher

Mesos does not explicitly support HDFS and S3.  Rather, Mesos will assume you 
have a hadoop binary and use it (blindly) for certain types of URIs.  If the 
hadoop binary is not present, the mesos-fetcher will fail to fetch your HDFS or 
S3 URIs.

Mesos does not ship/package hadoop, so these URIs are not expected to work out 
of the box (for plain Mesos distributions).  In all cases, the operator must 
preconfigure hadoop on each node (similar to how Docker in Mesos works).
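
A quick way to verify a node is preconfigured correctly is to run, as the
agent's user, the same sort of command the fetcher shells out to (the URI
below is just an example):

# If these succeed on the agent, the mesos-fetcher should be able to
# fetch the same URI. s3a://my-bucket/app.tar.gz is hypothetical.
hadoop version
hadoop fs -copyToLocal s3a://my-bucket/app.tar.gz /tmp/app.tar.gz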

Here's the epic tracking the modularization of the mesos-fetcher (I estimate 
it'll be done by 0.30):
https://issues.apache.org/jira/browse/MESOS-3918

^ Once done, it should be easier to plug in more fetchers, such as one for your 
use-case.

On Tue, May 10, 2016 at 11:21 AM, Briant, James <[email protected]> wrote:
I’m happy to have a default IAM role on the box that can do read-only fetches 
from my s3 bucket. s3a gets the credentials from AWS instance metadata. It works.
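
For concreteness, this is roughly all the filesystem config my agents need; a
sketch of my setup, where the conf path and registering fs.s3a.impl explicitly
(needed on this Hadoop 2.5-era build) are specifics of my environment:

# Sketch: minimal core-site.xml registering the s3a filesystem. With no
# access keys configured here, the hadoop-aws credential chain falls
# through to instance-profile (IAM role) credentials from the instance
# metadata service. The /etc/hadoop/conf path is an assumption.
cat > /etc/hadoop/conf/core-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.s3a.impl</name>
    <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
  </property>
</configuration>
EOF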

If Hadoop is gone, does that mean that hdfs: URIs don’t work either?

Are you saying DC/OS and Mesos are diverging? Mesos explicitly supports HDFS 
and S3.

In the absence of S3, how do you propose I make large binaries available to my 
cluster, and only to my cluster, on AWS?

Jamie

From: Cody Maloney <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, May 10, 2016 at 10:58 AM
To: "[email protected]" <[email protected]>
Subject: Re: Enable s3a for fetcher

The s3 fetcher stuff inside of DC/OS is not supported. The `hadoop` binary has 
already been entirely removed from DC/OS 1.8. There have been various proposals 
to make the Mesos fetcher much more pluggable / extensible 
(https://issues.apache.org/jira/browse/MESOS-2731, for instance).

Generally speaking, people want a lot of different sorts of fetching, and there 
are all sorts of questions about how to properly get auth to the various chunks 
(if you're using s3a:// you presumably need to get credentials there somehow; 
otherwise you could just use http://). That needs to be designed / built into 
Mesos and DC/OS before this stuff can be used.

Cody

On Tue, May 10, 2016 at 9:55 AM Briant, James <[email protected]> wrote:
I want to use s3a: URLs in the fetcher. I’m using DC/OS 1.7, which has Hadoop 
2.5 on its agents. This version has the necessary hadoop-aws and aws-sdk jars:

hadoop--afadb46fe64d0ee7ce23dbe769e44bfb0767a8b9]$ ls usr/share/hadoop/tools/lib/ | grep aws
aws-java-sdk-1.7.4.jar
hadoop-aws-2.5.0-cdh5.3.3.jar

What config/scripts do I need to hack to get these guys on the classpath so 
that "hadoop fs -copyToLocal" works?

Thanks,
Jamie
