Jamie, I’m in Europe this week, so the timing of my responses is out of sync / delayed. There are two issues in play here. The first is having a pluggable Mesos fetcher; it sounds like that is scheduled for 0.30. The other is what is available on DC/OS. Could you move that discussion to that mailing list? I will definitely work with you on getting this resolved.
ken

> On May 10, 2016, at 3:45 PM, Briant, James <[email protected]> wrote:
>
> Ok. Thanks Joseph. I will figure out how to get a more recent hadoop onto my
> dcos agents then.
>
> Jamie
>
> From: Joseph Wu <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Tuesday, May 10, 2016 at 1:40 PM
> To: user <[email protected]>
> Subject: Re: Enable s3a for fetcher
>
> I can't speak to what DCOS does or will do (you can ask on the associated
> mailing list: [email protected]).
>
> We will be maintaining existing functionality for the fetcher, which means
> supporting the schemes:
> * file
> * http, https, ftp, ftps
> * hdfs, hftp, s3, s3n  <-- These rely on hadoop.
>
> And we will retain the --hadoop_home agent flag, which you can use to specify
> the hadoop binary.
>
> Other schemes might work right now, if you hack around with your node setup.
> But there's no guarantee that your hack will work between Mesos versions. In
> future, we will associate a fetcher plugin with each scheme. And you will be
> able to load custom fetcher plugins for additional schemes.
>
> TLDR: no "nerfing" and less hackiness :)
>
> On Tue, May 10, 2016 at 12:58 PM, Briant, James <[email protected]> wrote:
>> This is from the latest Mesos documentation:
>>
>> If the requested URI is based on some other protocol, then the fetcher tries
>> to utilise a local Hadoop client and hence supports any protocol supported
>> by the Hadoop client, e.g., HDFS, S3. See the slave configuration
>> documentation <http://mesos.apache.org/documentation/latest/configuration/>
>> for how to configure the slave with a path to the Hadoop client. [emphasis added]
>>
>> What you are saying is that DC/OS simply won't install hadoop on agents?
>>
>> Next question then: will you be nerfing fetcher.cpp, or will I be able to
>> install hadoop on the agents myself, such that Mesos will recognize s3a?
>>
>> From: Joseph Wu <[email protected]>
>> Reply-To: "[email protected]" <[email protected]>
>> Date: Tuesday, May 10, 2016 at 12:20 PM
>> To: user <[email protected]>
>> Subject: Re: Enable s3a for fetcher
>>
>> Mesos does not explicitly support HDFS and S3. Rather, Mesos will assume
>> you have a hadoop binary and use it (blindly) for certain types of URIs. If
>> the hadoop binary is not present, the mesos-fetcher will fail to fetch your
>> HDFS or S3 URIs.
>>
>> Mesos does not ship/package hadoop, so these URIs are not expected to work
>> out of the box (for plain Mesos distributions). In all cases, the operator
>> must preconfigure hadoop on each node (similar to how Docker in Mesos works).
>>
>> Here's the epic tracking the modularization of the mesos-fetcher (I estimate
>> it'll be done by 0.30): https://issues.apache.org/jira/browse/MESOS-3918
>>
>> ^ Once done, it should be easier to plug in more fetchers, such as one for
>> your use case.
>>
>> On Tue, May 10, 2016 at 11:21 AM, Briant, James <[email protected]> wrote:
>>> I'm happy to have a default IAM role on the box that can read-only fetch from
>>> my s3 bucket. s3a gets the credentials from the AWS instance metadata. It works.
>>>
>>> If hadoop is gone, does that mean that hdfs: URIs don't work either?
>>>
>>> Are you saying DC/OS and Mesos are diverging? Mesos explicitly supports hdfs
>>> and s3.
>>>
>>> In the absence of S3, how do you propose I make large binaries available to
>>> my cluster, and only to my cluster, on AWS?
>>>
>>> Jamie
>>>
>>> From: Cody Maloney <[email protected]>
>>> Reply-To: "[email protected]" <[email protected]>
>>> Date: Tuesday, May 10, 2016 at 10:58 AM
>>> To: "[email protected]" <[email protected]>
>>> Subject: Re: Enable s3a for fetcher
>>>
>>> The s3 fetcher stuff inside of DC/OS is not supported. The `hadoop` binary
>>> has been entirely removed from DC/OS 1.8 already. There have been various
>>> proposals to make the Mesos fetcher much more pluggable / extensible
>>> (https://issues.apache.org/jira/browse/MESOS-2731, for instance).
>>>
>>> Generally speaking, people want many different sorts of fetching, and there
>>> are all sorts of questions about how to properly get auth to the various
>>> chunks (if you're using s3a:// you presumably need to get credentials there
>>> somehow; otherwise you could just use http://). That needs to be designed and
>>> built into Mesos and DC/OS before this stuff is usable.
>>>
>>> Cody
>>>
>>> On Tue, May 10, 2016 at 9:55 AM Briant, James <[email protected]> wrote:
>>>> I want to use s3a: urls in the fetcher. I'm using DC/OS 1.7, which has hadoop
>>>> 2.5 on its agents. This version has the necessary hadoop-aws and aws-sdk jars:
>>>>
>>>> hadoop--afadb46fe64d0ee7ce23dbe769e44bfb0767a8b9]$ ls usr/share/hadoop/tools/lib/ | grep aws
>>>> aws-java-sdk-1.7.4.jar
>>>> hadoop-aws-2.5.0-cdh5.3.3.jar
>>>>
>>>> What config/scripts do I need to hack to get these guys on the classpath
>>>> so that "hadoop fs -copyToLocal" works?
>>>>
>>>> Thanks,
>>>> Jamie
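For what it's worth, the usual answer to Jamie's classpath question is Hadoop's HADOOP_CLASSPATH hook. A minimal sketch, not DC/OS-specific guidance: the /opt/hadoop install path and the bucket/object names are assumptions, and some older Hadoop builds may additionally need fs.s3a.impl declared in core-site.xml.

    # Put the bundled hadoop-aws and aws-sdk jars on the Hadoop classpath so
    # "hadoop fs" understands s3a:// URIs. /opt/hadoop is an assumed location;
    # point HADOOP_HOME at wherever hadoop is actually unpacked on the agent.
    export HADOOP_HOME=/opt/hadoop
    export HADOOP_CLASSPATH="${HADOOP_HOME}/share/hadoop/tools/lib/*"

    # With a read-only IAM instance role on the box, s3a picks up credentials
    # from the AWS instance metadata, so no keys need to be configured here.
    # Bucket and object names below are placeholders.
    "${HADOOP_HOME}/bin/hadoop" fs -copyToLocal s3a://my-bucket/my-large-binary.tar.gz /tmp/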

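And on the Mesos side, per Joseph's note that the --hadoop_home agent flag is being retained: a sketch of pointing the agent at that same Hadoop install so the fetcher can shell out to it for hadoop-backed schemes. The master address and work_dir values are placeholders.

    # Tell the agent where the Hadoop client lives; the fetcher invokes it for
    # URIs it does not handle natively (hdfs, s3, s3n, and s3a if the local
    # client supports that scheme).
    mesos-slave --master=zk://master.mesos:2181/mesos \
                --work_dir=/var/lib/mesos \
                --hadoop_home=/opt/hadoop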
