So there are two issues I am currently looking into. The first is directory
permissions.  I'd still like to get the group's feelings on that, because
I've managed to get Myriad/YARN working on one cluster (based on Ubuntu
14.04) but can't get it to work on another cluster based on Red Hat 7.  It's
strange: from what I can tell everything is the same, but the Red Hat
cluster complains about the permissions of /etc/hadoop not being owned by
root (it's owned by mapr:root, and on the Ubuntu cluster that works fine
with the same ownership!)  I do notice that the build times reported by MapR
are different... but that may just be the build for Red Hat vs. the build
for Ubuntu?  Still digging into that one...
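
What I've been comparing on the two clusters is something like the below (a
rough sketch; the path is an assumption based on where my tarball unpacks,
adjust as needed), since the ownership check seems to walk every directory
up to /:

DIR=/opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop   # assumed install location
while [ "$DIR" != "/" ]; do
  stat -c '%U:%G %a %n' "$DIR"
  DIR=$(dirname "$DIR")
done
stat -c '%U:%G %a %n' /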

As to the hostname / mapr.host issue: I found a neat hack that may work for
folks.

By setting this in my myriad config:

yarnEnvironment:
  YARN_HOME: hadoop-2.7.0
  YARN_NODEMANAGER_OPTS: "-Dnodemanager.resource.io-spindles=4.0 -Dmapr.host=$(hostname -f)"


I am able to get mapr.host set back to the correct hostname where the
NodeManager is running, which helps with a number of issues.  I thought
about this, and realized it would be better if I could pass the hostname to
the createTTVolume script but use a unique name for the mount point (what
if I have multiple NMs on a single physical node?).
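
An untested sketch of what that combined value might look like (assuming
createTTVolume tolerates a unique suffix tacked onto the FQDN, which I have
not verified):

  YARN_NODEMANAGER_OPTS: "-Dnodemanager.resource.io-spindles=4.0 -Dmapr.host=$(hostname -f)-$(basename `pwd`)"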

So, I tried:

YARN_NODEMANAGER_OPTS: "-Dnodemanager.resource.io-spindles=4.0 -Dmapr.host=$(basename `pwd`)"


My thinking was that if I used the directory name of the "run" in my
sandbox, I should be reasonably assured of uniqueness.  That seemed to work:
the NodeManager kicked off the command:

/opt/mapr/server/createTTVolume.sh hadoopmapr2.brewingintel.com
/var/mapr/local/48833481-0c7a-4728-8f93-bcf9b545ad81/mapred
/var/mapr/local/48833481-0c7a-4728-8f93-bcf9b545ad81/mapred/nodeManager yarn


However, the script never returned, and the task failed.  So at this point
I think we can get good info passed to the MapR script, but I am not sure
how the script (or MapR itself) creates the volume so that it works in
conjunction with YARN.
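
When it hangs, something like the below from another shell on that node
(just a sketch, nothing authoritative) should at least show whether the
volume and mount point ever get created:

maprcli volume list | grep 48833481
hadoop fs -ls /var/mapr/local/48833481-0c7a-4728-8f93-bcf9b545ad81/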

To summarize: right now I have Myriad working... but only on an Ubuntu Mesos
cluster. I have NOT changed my /tmp location for the Mesos slaves, I can't
get things working on my Red Hat cluster, and I seem to have found a hacky
workaround for mapr.host.  Here is the script I am using to build my tgz
file...


#!/bin/bash


BASEDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"


echo "#### Base Location: $BASEDIR"


################## BUILD VARIABLES


HADOOP_VER="hadoop-2.7.0"

HADOOP_BASE="/opt/mapr/hadoop"

HADOOP_HOME="${HADOOP_BASE}/${HADOOP_VER}"

MYRIAD_BUILD="/opt/mapr/myriad/incubator-myriad"

CONF_LOC="/mapr/brewpot/mesos/myriad/conf"

URI_LOC="/mapr/brewpot/mesos/myriad"




#################################

#Clean the working directory

echo "#### Cleaning Build DIR and old tgz"

sudo rm -rf ./${HADOOP_VER}

sudo rm -rf ./${HADOOP_VER}.tgz


#Copy a fresh copy of the hadoopz

echo "#### Copying Clean Build"


# I go here and tar with h and p: h so that all the symlinked items in the
# MapR install get put into the tgz, and p to preserve permissions

cd $HADOOP_BASE

sudo tar zcfhp ${BASEDIR}/${HADOOP_VER}.tgz $HADOOP_VER


echo "#### Untaring New Build"

# I untar things in a new location so I can play without affecting the
# "stock" install

cd $BASEDIR

sudo tar zxfp ${HADOOP_VER}.tgz


echo "#### Now remove source tgz to get ready for build"

sudo rm ${HADOOP_VER}.tgz




# This permission combination seems to work. I go and grab the
# container-executor from the stock build so that I have the setuid version.

echo "#### Cleaning Base Build Logs, yarn-site, and permissions"

sudo rm $HADOOP_VER/etc/hadoop/yarn-site.xml

sudo rm -rf $HADOOP_VER/logs/*

sudo chown mapr:mapr ${HADOOP_VER}

sudo chown -R mapr:root ${HADOOP_VER}/*

sudo chown root:root ${HADOOP_VER}/etc/hadoop/container-executor.cfg

sudo cp --preserve ${HADOOP_HOME}/bin/container-executor ${HADOOP_VER}/bin/
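
# (Optional sanity check, an assumption on my part about what to look for:
# after the copy, container-executor should still be owned by root with the
# setuid bit set, since that is what the NM complains about when it's wrong)
#ls -l ${HADOOP_VER}/bin/container-executor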




# Copy the jars from Myriad into the Hadoop libs folders (you will need to
# have built Myriad first, with the root of your build being $MYRIAD_BUILD)

echo "#### Copying Myriad Jars"

sudo cp $MYRIAD_BUILD/myriad-scheduler/build/libs/*.jar \
  $HADOOP_VER/share/hadoop/yarn/lib/

sudo cp $MYRIAD_BUILD/myriad-executor/build/libs/myriad-executor-0.1.0.jar \
  $HADOOP_VER/share/hadoop/yarn/lib/


#Address Configs


# First take the myriad-config-default.yml and put it into
# ${HADOOP_VER}/etc/hadoop so it's in the tarball

echo "#### Updating myriad-config-default.yml"

sudo cp ${CONF_LOC}/myriad-config-default.yml ${HADOOP_VER}/etc/hadoop/


# Tar all the things with all the privs

echo "#### Tarring all the things"

sudo tar zcfhp ${HADOOP_VER}.tgz ${HADOOP_VER}/


# Copy to the URI location... note I am using MapR, so I cp it directly to
# the MapR-FS location via the NFS share; it would probably be better to use
# a hadoop copy command for interoperability

echo "#### Copying to HDFS Location"

cp ${HADOOP_VER}.tgz ${URI_LOC}/
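
# Untested alternative for clusters without the NFS mount: push via the
# Hadoop CLI instead. The target path here is an assumption; it should be
# whatever MapR-FS path ${URI_LOC} maps to under /mapr/<cluster>.
#hadoop fs -put ${HADOOP_VER}.tgz /mesos/myriad/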


# I do this because it worked... not sure if I remo

#sudo chown mapr:mapr ${URI_LOC}/${HADOOP_VER}.tgz


#echo "#### Cleaning unpacked location"

sudo rm -rf ./${HADOOP_VER}

sudo rm ./${HADOOP_VER}.tgz




On Wed, Nov 18, 2015 at 9:40 AM, yuliya Feldman <[email protected]> wrote:

> I would love to have that piece of code "configurable", but it is not at
> the moment.
> Will send you the patch offline.
> Thanks, Yuliya
>       From: John Omernik <[email protected]>
>  To: [email protected]; yuliya Feldman <[email protected]>
>  Sent: Wednesday, November 18, 2015 6:02 AM
>  Subject: Re: Struggling with Permissions
>
> Yuliya, I would be interested in the patch for MapR. Is that a patch for
> Myriad or a patch for Hadoop on MapR?  I wonder if there is a hadoop env
> file I could modify in my TGZ to help address the issue on my nodes as
> well. Can you describe what "mapr.host" is, and whether I can forcibly
> overwrite it in my ENV file, or will MapR clobber it at a later point in
> execution? I am thinking that with some simple sed, I could "fix" the conf
> file.
>
> Wait, I suppose there is no way for me to edit the command used to run the
> node manager... there's a thought. Could Myriad provide an ENV value or
> something that would allow us to edit the command or insert something into
> the command that is used to run the NM?  (Below is the command on my
> cluster.)  Basically, if there were a way to template that and alter it in
> the Myriad config, I could add commands to update the variables in the conf
> file before it's copied to yarn-site on every node... just spitballing
> ideas here...
>
>
>
> sudo tar -zxpf hadoop-2.7.0-NM.tar.gz && sudo chown mapr . && cp conf
> hadoop-2.7.0/etc/hadoop/yarn-site.xml; export YARN_HOME=hadoop-2.7.0; sudo
> -E -u mapr -H env YARN_HOME=hadoop-2.7.0
> YARN_NODEMANAGER_OPTS=-Dnodemanager.resource.io-spindles=4.0
> -Dyarn.resourcemanager.hostname=myriad.marathon.mesos
>
> -Dyarn.nodemanager.container-executor.class=org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor
> -Dnodemanager.resource.cpu-vcores=2 -Dnodemanager.resource.memory-mb=8192
> -Dmyriad.yarn.nodemanager.address=0.0.0.0:31984
> -Dmyriad.yarn.nodemanager.localizer.address=0.0.0.0:31233
> -Dmyriad.yarn.nodemanager.webapp.address=0.0.0.0:31716
> -Dmyriad.mapreduce.shuffle.port=31786  /bin/yarn nodemanager
>
>
>
> On Tue, Nov 17, 2015 at 4:44 PM, yuliya Feldman <[email protected]> wrote:
>
> > Hadoop (not MapR) requires the whole path starting from "/" to be owned
> > by root and writable only by root.
> > The second problem is exactly what I was talking about: configuration
> > being taken from the RM overwrites the local one.
> > I can give you a patch to mitigate the issue for MapR if you are building
> > from source.
> > Thanks, Yuliya
> >      From: John Omernik <[email protected]>
> >  To: [email protected]
> >  Sent: Tuesday, November 17, 2015 1:15 PM
> >  Subject: Re: Struggling with Permissions
> >
> > Well, sure, /tmp is world writable, but /tmp/mesos is not world writable,
> > thus there is a sandbox to play in there... or am I missing something? Not
> > to mention my /tmp is rwt, which is world writable but only the creator or
> > root can modify (based on the googles).
> > Yuliya:
> >
> > I am seeing a weird behavior with MapR as it relates to (I believe) the
> > mapr_direct_shuffle.
> >
> > In the NodeManager logs, I see things starting, and it says "Checking for
> > local volume, if local volume is not present command will create and mount
> > it".
> >
> > Command invoked is: /opt/mapr/server/createTTVolume.sh
> > hadoopmapr7.brewingintel.com
> > /var/mapr/local/hadoopmapr2.brewingintel.com/mapred
> > /var/mapr/local/hadoopmapr2.brewingintel.com/mapred/nodeManager yarn
> >
> >
> > What is interesting here is that hadoopmapr7 is the NodeManager it's
> > trying to start on; however, the mount point it's trying to create is
> > named for hadoopmapr2, which is the node the ResourceManager happened to
> > fall on...  I was very confused by that, because in no place should
> > hadoopmapr2 be "known" to the NodeManager, since it thinks the
> > ResourceManager hostname is myriad.marathon.mesos.
> >
> > So why was it hard-coding to the node the ResourceManager is running on?
> >
> > Well, if I look at the conf file in the sandbox (the file that gets copied
> > to be yarn-site.xml for NodeManagers), there ARE four references to
> > hadoopmapr2. Three of the four say "source programatically" and one is
> > just set... that's mapr.host.  Could there be some downstream hinkyness
> > going on with how MapR is setting hostnames?  All of these variables seem
> > "wrong", in that mapr.host (on the NodeManager) should be hadoopmapr7 in
> > this case, and the ResourceManager addresses should all be
> > myriad.marathon.mesos.  I'd be interested in your thoughts here, because
> > I am stumped at how these are getting set.
> >
> >
> >
> >
> >
> > <property><name>yarn.resourcemanager.address</name><value>hadoopmapr2:8032</value><source>programatically</source></property>
> > <property><name>mapr.host</name><value>hadoopmapr2.brewingintel.com</value></property>
> > <property><name>yarn.resourcemanager.resource-tracker.address</name><value>hadoopmapr2:8031</value><source>programatically</source></property>
> > <property><name>yarn.resourcemanager.admin.address</name><value>hadoopmapr2:8033</value><source>programatically</source></property>
> >
> >
> >
> >
> >
> >
> >
> > On Tue, Nov 17, 2015 at 2:51 PM, Darin Johnson <[email protected]>
> > wrote:
> >
> > > Yuliya: Are you referencing yarn.nodemanager.hostname or a MapR-specific
> > > option?
> > >
> > > I'm working right now on passing
> > > -Dyarn.nodemanager.hostname=offer.getHostName().  Useful if you've got
> > > extra IPs for a SAN or management network.
> > >
> > > John: Yeah, the permissions on the tarball are a pain to get right.  I'm
> > > working on Docker support and a build script for the tarball, which
> > > should make things easier.  Also, on the point of using world-writable
> > > directories: it's a little scary from the security side of things to
> > > allow executables to run there, especially things running as privileged
> > > users.  Many distros of Linux will mount /tmp noexec.
> > >
> > > Darin
> > >
> > > On Tue, Nov 17, 2015 at 2:53 PM, yuliya Feldman <[email protected]> wrote:
> > >
> > > > Please change the work directory for the Mesos slave to one that is
> > > > not /tmp, and make sure that dir is owned by root.
> > > > There is one more caveat with the binary distro and MapR: in Myriad
> > > > code for the binary distro, configuration is copied from the RM to the
> > > > NMs. That does not work for MapR, since we need the hostname (yes, for
> > > > the sake of local volumes) to be unique.
> > > > MapR will have a Myriad release to handle this situation.
> > > >      From: John Omernik <[email protected]>
> > > >  To: [email protected]
> > > >  Sent: Tuesday, November 17, 2015 11:37 AM
> > > >  Subject: Re: Struggling with Permissions
> > > >
> > > > Oh hey, I found a post by me back on Sept 9.  I looked at the JIRAs
> > > > and followed the instructions, with the same errors. At this point do I
> > > > still need to have a place where the entire path is owned by root? That
> > > > seems like an odd requirement (a change on each node to facilitate a
> > > > framework).
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Tue, Nov 17, 2015 at 1:25 PM, John Omernik <[email protected]> wrote:
> > > >
> > > > > Hey all, I am struggling with permissions on Myriad, trying to get
> > > > > the right permissions in the tgz as well as who to run as.  I am
> > > > > running on MapR, which means I need to run as mapr or root (otherwise
> > > > > my volume creation scripts will fail on MapR; MapR folks, we should
> > > > > talk more about those scripts).
> > > > >
> > > > > But back to the code, I've had lots of issues. When I run the
> > > > > Frameworkuser and Superuser as mapr, it unpacks everything as mapr
> > > > > and I get "/bin/container-executor must be owned by root but is owned
> > > > > by 700" (700 being my mapr UID).
> > > > >
> > > > > So now I am running as root, and I am getting the error below as it
> > > > > relates to /tmp. I am not sure which /tmp this refers to: the /tmp
> > > > > that my slave is executing in (i.e., my local Mesos agent's /tmp
> > > > > directory), or my MapR-FS /tmp directory (both of which are world
> > > > > writable, as /tmp typically is... or am I mistaken here)?
> > > > >
> > > > > Any thoughts on how to get this to resolve? This happens when the
> > > > > NodeManager is trying to start, running as root, with root set for
> > > > > both of my Myriad users.
> > > > >
> > > > > Thanks!
> > > > >
> > > > >
> > > > > Caused by: ExitCodeException exitCode=24: File /tmp must not be
> > > > > world or group writable, but is 1777
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > > >
> > >
> >
> >
> >
> >
>
>
>
>
