John, I'm not super familiar with MapR, but I think I might have some thoughts, and the MapR people can chime in :).

I think the mapr.host thing is due to the fact that in the remote distribution, Myriad pulls its config from the resource manager. As I mentioned in my note to Yuliya, I'm working on adding the ability to set yarn.nodemanager.hostname as a -D option. I think the right thing may be to expose an environment variable $HOSTNAME, and then in yarnEnvironment: you could set a YARN_OPTS=-Dmapr.hostname=$HOSTNAME -Dyarn.nodemanager.hostname=$HOSTNAME ... option. One could imagine a similar option for ports, as this is kind of what Marathon does. Maybe best to JIRA this, as I don't think we necessarily expose a lot of the things we should just yet.
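Roughly this (an untested sketch - it assumes the executor actually exports the agent's hostname into the NM environment, which is the part that doesn't exist yet):

    # untested sketch: if the executor exported the agent's hostname, e.g.
    export HOSTNAME=$(hostname -f)
    # ...then a single yarnEnvironment entry could point both properties at it:
    export YARN_OPTS="-Dmapr.hostname=${HOSTNAME} -Dyarn.nodemanager.hostname=${HOSTNAME}"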
On Tue, Nov 17, 2015 at 4:41 PM, John Omernik <[email protected]> wrote:

> What's even stranger is I can't for the life of me find where "mapr.host"
> gets set or used. I did a grep -P -R "mapr\.host" ./* in /opt/mapr (which
> included me pulling down the myriad code into
> /opt/mapr/myriad/incubator-myriad) and found only one reference, in
> /opt/mapr/server/mapr_yarn_install.sh:
>
>     <property>
>       <name>yarn.nodemanager.hostname</name>
>       <value>\${mapr.host}</value>
>     </property>" | sudo tee -a ${YARN_CONF_FILE}
>
> But I don't think that is being called at all by the resource manager...
>
> (Note: when I create my tarball from the /opt/mapr/hadoop/hadoop-2.7.0
> directory I am using tar -zcfhp to both preserve permissions and include
> the files that are symlinked... not sure if that affects things here.)
>
> On Tue, Nov 17, 2015 at 3:15 PM, John Omernik <[email protected]> wrote:
>
> > Well, sure, /tmp is world-writable, but /tmp/mesos is not world-writable,
> > thus there is a sandbox to play in there... or am I missing something?
> > Not to mention my /tmp is rwt, which is world-writable but only the
> > creator or root can modify (based on the googles).
> >
> > Yuliya:
> >
> > I am seeing a weird behavior with MapR as it relates to (I believe) the
> > mapr_direct_shuffle.
> >
> > In the Node Manager logs, I see things starting and it saying "Checking
> > for local volume, if local volume is not present command will create and
> > mount it".
> >
> > The command invoked is:
> >
> >     /opt/mapr/server/createTTVolume.sh hadoopmapr7.brewingintel.com
> >     /var/mapr/local/hadoopmapr2.brewingintel.com/mapred
> >     /var/mapr/local/hadoopmapr2.brewingintel.com/mapred/nodeManager yarn
> >
> > What is interesting here is that hadoopmapr7 is the node manager it's
> > trying to start on; however, the mount point it's trying to create is on
> > hadoopmapr2, which is the node the resource manager happened to fall on.
> > I was very confused by that, because in no place should hadoopmapr2 be
> > "known" to the node manager, because it thinks the resource manager
> > hostname is myriad.marathon.mesos.
> >
> > So why is it hard-coding to the node the resource manager is running on?
> >
> > Well, if I look at the conf file in the sandbox (the file that gets
> > copied to be yarn-site.xml for node managers), there ARE four references
> > to hadoopmapr2. Three of the four say source "programatically" and one is
> > just set... that's mapr.host. Could there be some downstream hinkyness
> > going on with how MapR is setting hostnames? All of these variables seem
> > "wrong", in that mapr.host (on the node manager) should be hadoopmapr7 in
> > this case, and the resource managers should all be myriad.marathon.mesos.
> > I'd be interested in your thoughts here, because I am stumped at how
> > these are getting set.
> >
> >     <property><name>yarn.resourcemanager.address</name><value>hadoopmapr2:8032</value><source>programatically</source></property>
> >     <property><name>mapr.host</name><value>hadoopmapr2.brewingintel.com</value></property>
> >     <property><name>yarn.resourcemanager.resource-tracker.address</name><value>hadoopmapr2:8031</value><source>programatically</source></property>
> >     <property><name>yarn.resourcemanager.admin.address</name><value>hadoopmapr2:8033</value><source>programatically</source></property>
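(An aside on the tarball note above: if tar -zcfhp is the literal command, GNU tar will take "hp" as the archive name, since -f grabs the rest of the option bundle as its argument. A sketch with -f last, paths illustrative:

    # -h archives the targets of symlinks; -f must come last in the bundle
    # so the archive name can follow it (-p mainly matters on extract)
    cd /opt/mapr/hadoop
    sudo tar -czhpf ~/hadoop-2.7.0.tgz hadoop-2.7.0/
)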
> > On Tue, Nov 17, 2015 at 2:51 PM, Darin Johnson <[email protected]> wrote:
> >
> > > Yuliya: Are you referencing yarn.nodemanager.hostname or a MapR-specific
> > > option?
> > >
> > > I'm working right now on passing a
> > > -Dyarn.nodemanager.hostname=offer.getHostName(). Useful if you've got
> > > extra IPs for a SAN or management network.
> > >
> > > John: Yeah, the permissions on the tarball are a pain to get right. I'm
> > > working on Docker support and a build script for the tarball, which
> > > should make things easier. Also, to the point of using world-writable
> > > directories: it's a little scary from the security side of things to
> > > allow executables to run there, especially things running as privileged
> > > users. Many distros of Linux will mount /tmp noexec.
> > >
> > > Darin
> > >
> > > On Tue, Nov 17, 2015 at 2:53 PM, yuliya Feldman <[email protected]> wrote:
> > >
> > > > Please change the work dir for the mesos slave to one that is not
> > > > /tmp, and make sure that dir is owned by root.
> > > > There is one more caveat with the binary distro and MapR - in the
> > > > Myriad code for the binary distro, configuration is copied from the
> > > > RM to the NMs - that does not work for MapR, since we need the
> > > > hostname (yes, for the sake of local volumes) to be unique.
> > > > MapR will have a Myriad release to handle this situation.
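(A minimal sketch of that first suggestion - paths are illustrative, any root-owned directory outside /tmp should do, and the /etc/mesos-slave file mechanism assumes the Mesosphere packages:

    # pick a root-owned work dir outside /tmp for the agent
    sudo mkdir -p /var/mesos
    sudo chown root:root /var/mesos
    # with the Mesosphere packages, agent flags can be set via files:
    echo /var/mesos | sudo tee /etc/mesos-slave/work_dir
    sudo service mesos-slave restart
)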
> > > > From: John Omernik <[email protected]>
> > > > To: [email protected]
> > > > Sent: Tuesday, November 17, 2015 11:37 AM
> > > > Subject: Re: Struggling with Permissions
> > > >
> > > > Oh hey, I found a post by me back on Sept 9. I looked at the JIRAs
> > > > and followed the instructions, with the same errors. At this point do
> > > > I still need to have a place where the entire path is owned by root?
> > > > That seems like an odd requirement (a change on each node to
> > > > facilitate a framework).
> > > >
> > > > On Tue, Nov 17, 2015 at 1:25 PM, John Omernik <[email protected]> wrote:
> > > >
> > > > > Hey all, I am struggling with permissions on Myriad, trying to get
> > > > > the right permissions in the tgz as well as who to run as. I am
> > > > > running on MapR, which means I need to run as mapr or root
> > > > > (otherwise my volume creation scripts will fail on MapR; MapR
> > > > > folks, we should talk more about those scripts).
> > > > >
> > > > > But back to the code: I've had lots of issues. When I run the
> > > > > frameworkUser and frameworkSuperUser as mapr, it unpacks everything
> > > > > as mapr and I get a "/bin/container-executor must be owned by root
> > > > > but is owned by 700" error (700 is my mapr UID).
> > > > >
> > > > > So now I am running as root, and I am getting the error below as it
> > > > > relates to /tmp. I am not sure which /tmp this refers to: the /tmp
> > > > > that my slave is executing in (i.e. my local mesos agent's /tmp
> > > > > directory), or my MapR-FS /tmp directory (both of which are
> > > > > world-writable, as /tmp typically is... or am I mistaken here?).
> > > > >
> > > > > Any thoughts on how to get this to resolve? This happens when the
> > > > > node manager is trying to start, running as root, with root for
> > > > > both of my Myriad users.
> > > > >
> > > > > Thanks!
> > > > >
> > > > >     Caused by: ExitCodeException exitCode=24: File /tmp must not be world or group writable, but is 1777
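(On those two errors: the usual container-executor requirements are that the binary be root-owned and setuid, with its group matching yarn.nodemanager.linux-container-executor.group - a sketch, with NM_DIR and the "hadoop" group as illustrative names:

    # NM_DIR is wherever the NM tarball gets unpacked on the agent
    sudo chown root:hadoop ${NM_DIR}/bin/container-executor
    sudo chmod 6050 ${NM_DIR}/bin/container-executor

container-executor also walks the parent directories of its config file and refuses to run if any of them is group- or world-writable, which a 1777 /tmp can never satisfy - that appears to be exactly what the exitCode=24 check above is tripping on, and why moving the agent work_dir out of /tmp is the fix.)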
