What's even stranger is I can't for the life of me find where "mapr.host" gets
set or used. I ran grep -P -R "mapr\.host" ./* in /opt/mapr (after pulling
the Myriad code down into /opt/mapr/myriad/incubator-myriad) and found only
one reference, in /opt/mapr/server/mapr_yarn_install.sh:
<property>
  <name>yarn.nodemanager.hostname</name>
  <value>\${mapr.host}</value>
</property>" | sudo tee -a ${YARN_CONF_FILE}
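As a toy sanity check (this is not MapR code, just the hostnames from this thread), here is what that \${mapr.host} placeholder does once the file is copied around: if mapr.host is baked into the copied yarn-site.xml as the RM's host, any property referencing ${mapr.host} resolves to that host on whichever node reads the file.

```shell
# Toy illustration only: mimic Hadoop resolving ${mapr.host} inside a
# property value. If the copied yarn-site.xml pins mapr.host to the RM's
# hostname, yarn.nodemanager.hostname resolves to the RM's host on every
# NodeManager that loads the file.
mapr_host="hadoopmapr2.brewingintel.com"
nm_hostname='${mapr.host}'
resolved=$(printf '%s\n' "$nm_hostname" | sed "s/\${mapr.host}/${mapr_host}/")
printf '%s\n' "$resolved"
```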
But I don't think that is being called at all by the resource manager...
(Note: when I create my tarball from the /opt/mapr/hadoop/hadoop-2.7.0
directory I use tar -zchpf to both preserve permissions and include the files
that are symlinked... not sure if that affects things here.)
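For what it's worth, here is a minimal sketch of that tarball step with made-up paths. One gotcha: -f takes the archive name as its argument, so it has to come last in the flag bundle.

```shell
# Minimal sketch: -z gzip, -c create, -h dereference (follow) symlinks,
# -p preserve permissions, -f <archive>. -f must be the last flag in the
# bundle because it consumes the next word as the archive name.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/bin"
echo 'echo hello' > "$workdir/src/bin/real.sh"
ln -s bin/real.sh "$workdir/src/link.sh"   # symlink that -h will follow
tar -zchpf "$workdir/hadoop.tgz" -C "$workdir/src" .
# Because of -h, link.sh is stored as a regular file, not a symlink:
tar -ztf "$workdir/hadoop.tgz" | grep link.sh
```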
On Tue, Nov 17, 2015 at 3:15 PM, John Omernik <[email protected]> wrote:
> Well, sure, /tmp is world-writable, but /tmp/mesos is not, so there is a
> sandbox to play in there... or am I missing something? Not to mention my
> /tmp is rwxrwxrwt, i.e. world-writable but only the creator or root can
> delete or rename entries (based on the googles).
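(A quick sketch of what that mode actually permits, using a throwaway directory: 1777 is world-writable, but the leading 1 is the sticky bit, which restricts delete/rename inside the directory to the entry's owner or root.)

```shell
# Demonstrate the sticky bit: 1777 = rwxrwxrwt. Anyone can create files in
# the directory, but only a file's owner (or root) may delete or rename it.
d=$(mktemp -d)
chmod 1777 "$d"
stat -c '%a' "$d"   # GNU stat; on BSD/macOS use: stat -f '%Lp' "$d"
rmdir "$d"
```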
> Yuliya:
>
> I am seeing a weird behavior with MapR as it relates to (I believe) the
> mapr_direct_shuffle.
>
> In the Node Manager logs, I see things starting, and it says "Checking
> for local volume, if local volume is not present command will create and
> mount it"
>
> Command invoked is: /opt/mapr/server/createTTVolume.sh
> hadoopmapr7.brewingintel.com /var/mapr/local/
> hadoopmapr2.brewingintel.com/mapred /var/mapr/local/
> hadoopmapr2.brewingintel.com/mapred/nodeManager yarn
>
>
> What is interesting here is that hadoopmapr7 is the node the nodemanager is
> trying to start on; however, the mount point it's trying to create is on
> hadoopmapr2, which is the node the resource manager happened to land on... I
> was very confused by that, because nowhere should hadoopmapr2 be "known" to
> the nodemanager, since it thinks the resource manager hostname is
> myriad.marathon.mesos.
>
> So why was it hard coding to the node the resource manager is running on?
>
> Well, if I look at the conf file in the sandbox (the file that gets copied
> to become yarn-site.xml for node managers), there ARE four references to
> hadoopmapr2. Three of the four say <source>programatically</source> (sic)
> and one is just set... that one is mapr.host. Could there be some downstream
> hinkiness going on with how MapR is setting hostnames? All of these values
> seem "wrong" in that mapr.host (on the node manager) should be hadoopmapr7
> in this case, and the resource manager addresses should all be
> myriad.marathon.mesos. I'd be interested in your thoughts here, because I am
> stumped as to how these are getting set.
>
>
>
>
> <property><name>yarn.resourcemanager.address</name><value>hadoopmapr2:8032</value><source>programatically</source></property>
> <property><name>mapr.host</name><value>hadoopmapr2.brewingintel.com</value></property>
>
> <property><name>yarn.resourcemanager.resource-tracker.address</name><value>hadoopmapr2:8031</value><source>programatically</source></property>
>
> <property><name>yarn.resourcemanager.admin.address</name><value>hadoopmapr2:8033</value><source>programatically</source></property>
>
>
>
>
>
> On Tue, Nov 17, 2015 at 2:51 PM, Darin Johnson <[email protected]>
> wrote:
>
>> Yuliya: Are you referencing yarn.nodemanager.hostname or a mapr specific
>> option?
>>
>> I'm working right now on passing
>> -Dyarn.nodemanager.hostname=offer.getHostName(). Useful if you've got
>> extra IPs for a SAN or management network.
>>
>> John: Yeah, the permissions on the tarball are a pain to get right. I'm
>> working on Docker support and a build script for the tarball, which should
>> make things easier. Also, on the point of using world-writable directories:
>> it's a little scary from the security side of things to allow executables
>> to run there, especially things running as privileged users. Many distros
>> of Linux will mount /tmp noexec.
>>
>> Darin
>>
>> On Tue, Nov 17, 2015 at 2:53 PM, yuliya Feldman <[email protected]>
>> wrote:
>>
>> > Please change the workdir for the mesos slave to one that is not /tmp,
>> > and make sure that dir is owned by root.
>> > There is one more caveat with the binary distro and MapR: in the Myriad
>> > code for the binary distro, the configuration is copied from the RM to
>> > the NMs. That does not work for MapR, since we need the hostname (yes,
>> > for the sake of local volumes) to be unique.
>> > MapR will have a Myriad release to handle this situation.
>> > From: John Omernik <[email protected]>
>> > To: [email protected]
>> > Sent: Tuesday, November 17, 2015 11:37 AM
>> > Subject: Re: Struggling with Permissions
>> >
>> > Oh hey, I found a post by me back on Sept 9. I looked at the JIRAs and
>> > followed the instructions, with the same errors. At this point do I still
>> > need to have a place where the entire path is owned by root? That seems
>> > like an odd requirement (a change on each node just to facilitate a
>> > framework).
>> >
>> >
>> >
>> >
>> >
>> > On Tue, Nov 17, 2015 at 1:25 PM, John Omernik <[email protected]> wrote:
>> >
>> > > Hey all, I am struggling with permissions on Myriad: trying to get the
>> > > right permissions in the tgz, as well as who to run as. I am running on
>> > > MapR, which means I need to run as mapr or root (otherwise my volume
>> > > creation scripts will fail on MapR; MapR folks, we should talk more
>> > > about those scripts).
>> > >
>> > > But back to the code, I've had lots of issues. When I run the
>> > > FrameworkUser and Superuser as mapr, it unpacks everything as mapr and
>> > > I get a ""/bin/container-executor" must be owned by root but is owned
>> > > by 700" error (700 is my mapr user's UID).
>> > >
>> > > So now I am running as root, and I am getting the error below as it
>> > > relates to /tmp. I am not sure which /tmp this refers to: the /tmp my
>> > > slave is executing in (i.e. my local mesos agent's /tmp directory), or
>> > > my MapR-FS /tmp directory (both of which are world-writable, as /tmp
>> > > typically is... or am I mistaken here?).
>> > >
>> > > Any thoughts on how to get this to resolve? This happens when the
>> > > nodemanager is trying to start, running as root, with root for both of
>> > > my Myriad users.
>> > >
>> > > Thanks!
>> > >
>> > >
>> > > Caused by: ExitCodeException exitCode=24: File /tmp must not be world or group writable, but is 1777
>> > >
>> > >
>> > >
>> > >
>> >
>> >
>> >
>> >
>>
>
>