>Apologies for the unusual post but in the absence of a user list I am
>sort of left without other options. :-)

No worries. We hope to create a user@ soon after we see an increase
in emails like this to dev@. So, welcome and keep them coming!

>I have been keeping an eye on Myriad as we run a number of Hadoop
>nodes next to a bunch of some friendly use-cases and currently
>struggle to manage resourcing boundaries.

Glad to hear (again) that this is precisely the problem Myriad had set out
to solve.

>How would a MapR NM container connect back to Yarn  without running
>MFS inside the container?

Currently the NM doesn't run inside a (Docker) container, both with and
without Myriad. But the plan is to get there in the future to support
multi-tenancy. If you can, please share your story on why you'd want to run
the NM inside a container; it helps us work together on the right problems.

In this context, I'd also like to inform you that Swapnil, Sarjeet
and Mitra from MapR started working on this problem (and won 1st prize)
at the Docker Hack Day event. Check out:
https://blog.docker.com/2015/09/docker-global-hack-day-3-winners/


> Failure to do so (using the default config) will usually (always?)
>result a failure of the NM to start.

Always. You are spot on.

> I imagine this behaviour may be worked around but I could not find any
> documentation about it.

Currently MapR doesn't support running NM inside containers.

The createTTVolume.sh script is written to be invoked on the same host as
the MFS. So, when the NM runs inside a container, the container's hostname
doesn't match the physical host's hostname (unless you use Docker's host
networking mode), which causes the script to fail.

Recently, a couple of us experimented with this. We modified the script to
pick up the hostname from an environment variable (as opposed to
`hostname --fqdn`) and passed the host's hostname into that variable while
launching the container. With that, we were able to run the NM inside the
container and have it create a local volume on the MFS running on the host.
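To sketch the experiment roughly (the variable name NM_HOST_FQDN is
illustrative here, not the name we actually used in the modified script):

```shell
# Sketch of the hostname override described above. NM_HOST_FQDN is a
# hypothetical env variable name, not from the actual MapR script.
resolve_nm_hostname() {
  # Prefer the hostname passed in from the host at `docker run` time;
  # fall back to the script's original `hostname --fqdn` behaviour.
  if [ -n "${NM_HOST_FQDN:-}" ]; then
    echo "${NM_HOST_FQDN}"
  else
    hostname --fqdn
  fi
}
```

The container is then launched with the host's FQDN passed in, along the
lines of `docker run -e NM_HOST_FQDN="$(hostname --fqdn)" ...`, so the
modified script resolves the physical host rather than the container.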

Still, these are just experiments, and a lot more thought needs to go into
making this work in a containerized environment. For example, what about
running multiple such NM containers per host?

> If that is not the case, this will have some interesting consequences
> on MapR users adopting Myriad.

Can you please elaborate more on this? Currently Myriad doesn't launch
NMs inside docker containers. Myriad launches NMs via Mesos, but as
physical processes.

>So far so good. But when you take this with the
>com.mapr.hadoop.mapred.LocalVolumeAuxService behaviour mentioned
>above, one
>would end up having to run MFS on both the baremetal slave and within
>the container it is hosting!

I think a cleaner solution would be not to enforce running MFS inside the
containers. Trying to do that has other consequences, especially around the
need to pass the host's disks into Docker containers for MFS to format.

The right way to do this is to modify createTTVolume.sh to be container
aware.
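As a rough illustration of what "container aware" could mean (purely a
sketch; nothing like this exists in the current createTTVolume.sh), the
script could first detect whether it is running inside a container:

```shell
# Hypothetical container-detection helper. On most Linux setups,
# /proc/1/cgroup mentions the container runtime when PID 1 runs
# inside a container.
in_container() {
  [ -f /proc/1/cgroup ] && grep -qE '(docker|lxc|kubepods)' /proc/1/cgroup
}

if in_container; then
  echo "inside a container: would target the MFS on the physical host"
else
  echo "on bare metal: original behaviour applies"
fi
```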

On Fri, Oct 30, 2015 at 6:45 AM, Andre <[email protected]> wrote:

> Hi there,
>
> Apologies for the unusual post but in the absence of a user list I am
> sort of left without other options. :-)
>
> I have been keeping an eye on Myriad as we run a number of Hadoop
> nodes next to a bunch of some Mesos friendly use-cases and currently
> struggle to manage resourcing boundaries.
>
>
> I have asked this question to a number of different individuals at
> MapR but never managed to get a clear answer, therefore apologies for
> the somehow MapR focused question but....
>
> How would a MapR NM container connect back to Yarn  without running
> MFS inside the container?
>
> Reason I ask is simple:
>
> By default MapR NM nodes depend on MFS in order to create a local
> volume. This applies also to compute nodes and is explicitly
> documented behaviour and seen on the logs on entries similar to the
> following:
>
>
> 2015-09-10 11:11:13,794 INFO
> com.mapr.hadoop.mapred.LocalVolumeAuxService: Checking for local
> volume. If volume is not present command will create and mount it.
>
>
> Failure to do so (using the default config) will usually (always?)
> result a failure of the NM to start.
>
> I imagine this behaviour may be worked around but I could not find any
> documentation about it.
>
>
>
> If that is not the case, this will have some interesting consequences
> on MapR users adopting Myriad.
>
> Taking for example users opting for following an MFS backed Mesos
> topology like the one described in here:
>
> https://www.mapr.com/blog/my-experience-running-docker-containers-on-mesos
> https://www.mapr.com/solutions/zeta-enterprise-architecture
>
> My understanding is that a good number of baremetal servers would be
> running MFS and ensuring that resources (disk controllers and disk
> slots in this case) are utilised to their maximum potential.
>
> So far so good. But when you take this with the
> com.mapr.hadoop.mapred.LocalVolumeAuxService behaviour mentioned
> above, one
> would end up having to run MFS on both the baremetal slave and within
> the container it is hosting!
>
> This raises a few interesting questions:
>
> * How to deal with MFS service and its memory gluttony?
>        Caching would potentially be happening on both baremetal and
> containers...
>
> * Both barebone and contained would have to contact the CLDB in order
> to function
>        well... Lets say this would likely cause MapR share prices
> would go up. :-)
>
>
>
> Keen to understand from the MapR commiters what is the view / work
> around createTTvolume.sh
>
> Kind regards
>
> Andre
>
