Hello Andre,
I think pretty much all the features you described needing for your use case 
are at least being thought about, if not already in the works.
1. Yes - it is preferable to run MapR-FS on bare metal, and today with the 
MapR distro we have MapR-FS running in parallel with Mesos, not on top of Mesos. 
2. There definitely won't be a case where you have to run MapR-FS both inside and 
outside of a container - it will be just one MapR-FS, outside.
3. Local volumes on all MapR-FS-enabled nodes will still be there, but you 
will be able to have compute somewhere else. 
Thanks,
Yuliya
From: Andre <[email protected]>
To: [email protected]
Sent: Saturday, October 31, 2015 4:48 AM
Subject: Re: Myriad, MFS and CLDB

>> How would a MapR NM container connect back to YARN without running
>> MFS inside the container?
>
> Currently NM doesn't run inside a (docker) container (both with and
> without Myriad).
>
> But the plan is to get there in the future to support multi tenancy.
> If you can, please share your story on why you'd want to run NM
> inside a container. It helps to work together and work on the right
> problems.

I would say that in our case, the desired outcome would be having the
ability to, for example, run MapR-FS on bare metal, giving it all non-OS
spindles of a host and using it as the backing of a highly available
file store, without having to worry about yet another scale-out system.

We think of it in terms of being able to run NiFi / Elasticsearch /
Kafka / <your_favourite_apache_project> within a container, where
persistent files are backed by MapR-FS and resource allocation is
coordinated not by a specific platform but by an application-agnostic
framework.
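
To make that a bit more concrete, here is a rough sketch of what I have in
mind - not something we run today, and every name, URL and path in it is made
up. For the sake of illustration it uses Marathon on Mesos as the
application-agnostic framework: a Docker container for, say, a Kafka broker,
with its data directory mapped to a MapR-FS path exposed on the host through
the /mapr NFS mount.

    import json
    from urllib import request

    # Hypothetical Marathon endpoint; cluster name and paths are made up too.
    MARATHON = "http://marathon.example.com:8080/v2/apps"

    app = {
        "id": "/kafka-broker-1",
        "cpus": 2,
        "mem": 4096,
        "instances": 1,
        "container": {
            "type": "DOCKER",
            "docker": {"image": "example/kafka", "network": "HOST"},
            "volumes": [{
                # The host path lives on MapR-FS (via the host's /mapr NFS
                # mount); the container just sees an ordinary directory.
                "hostPath": "/mapr/my.cluster.com/apps/kafka/broker-1",
                "containerPath": "/var/lib/kafka",
                "mode": "RW",
            }],
        },
    }

    # POST the app definition to Marathon, which schedules it via Mesos.
    req = request.Request(MARATHON, data=json.dumps(app).encode("utf-8"),
                          headers={"Content-Type": "application/json"})
    print(request.urlopen(req).read())

The point being: the container is ephemeral and placed wherever Mesos likes,
while the data sits on MapR-FS on the bare-metal nodes.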

It reflects our experience (as probably many others have noticed before)
that a good number of current scale-out systems have very similar
hardware requirements:

http://doc.mapr.com/display/MapR/Planning+Cluster+Hardware
https://www.elastic.co/guide/en/elasticsearch/guide/current/hardware.html
http://docs.confluent.io/1.0/kafka/deployment.html
(the Kafka and Elasticsearch pages go so far as to share some
good chunks of identical text... )


This means some of this kit is so similar to the technology next door
that you could simply reallocate nodes between them, but this is messy and I
would rather use ephemeral, managed computing resources.



Why run inside a container? Basic hygiene.

We have users who use Python UDFs in their Hive queries and as a
matter of principle I would love to be able to prevent them from
running code (albeit indirectly) on bare metal.
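
Just to show the kind of thing I mean (an illustrative script only, not one of
ours): a Hive "Python UDF" is really just a streaming script run through
TRANSFORM, i.e. arbitrary Python executed wherever the task happens to land.

    #!/usr/bin/env python
    # clean_urls.py - illustrative Hive streaming script, invoked roughly as:
    #   ADD FILE clean_urls.py;
    #   SELECT TRANSFORM (user_id, raw_url)
    #     USING 'python clean_urls.py' AS (user_id, url)
    #   FROM page_views;
    # It is ordinary Python, so it runs with whatever access the task's host
    # gives it.
    import sys

    for line in sys.stdin:
        user_id, raw_url = line.rstrip("\n").split("\t")
        # nothing stops arbitrary code here (os.system, reading host files...)
        print("%s\t%s" % (user_id, raw_url.lower()))

If that process is boxed into a container rather than running straight on the
metal, I sleep better.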


>> If that is not the case, this will have some interesting consequences
>> on MapR users adopting Myriad.
>
> Can you please elaborate more on this? Currently Myriad doesn't launch
> NMs inside docker containers. Myriad launches NMs via Mesos, but as
> physical processes.

I meant that running MapR-FS both on bare metal and within a container
would be not only a technical waste but a financial waste as well.

If that (running MFS twice) ends up being the choice and MapR
continues to license as it does now, I would end up having to license
1 + N MapR nodes (where N is the number of containers reaching the
CLDB).

However, based on your email it seems to me this is not an issue yet,
as running the NM within a container is still unsupported and only
works through hacks.



> I think a cleaner solution would be to not enforce running MFS inside the
> containers. Trying to do that has other consequences, esp. around the need
> to pass the host's disks into docker containers for MFS to format.
>
> The right way to do this is to modify createTTVolume.sh to be container
> aware.

100% agree.
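
To illustrate what "container aware" could mean in practice - a rough sketch
only, nothing to do with the real createTTVolume.sh internals - the script
could first work out whether it is running inside a container, e.g. by looking
for /.dockerenv or a docker/lxc entry in /proc/1/cgroup, and only then decide
how to set up the local volume.

    import os

    def running_in_container():
        """Best-effort guess: are we inside a docker/lxc container?"""
        if os.path.exists("/.dockerenv"):
            return True
        try:
            with open("/proc/1/cgroup") as f:
                return any("docker" in line or "lxc" in line for line in f)
        except IOError:
            return False

    if __name__ == "__main__":
        if running_in_container():
            # e.g. reuse a volume the bare-metal MFS on the host prepared
            print("inside a container: use the volume prepared on the host")
        else:
            print("on bare metal: create the local volume directly")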


Thank you for the replies; it seems like it will be: Mesos, here I go. :-)

Cheers


  
