> ...the zero profiles still allocated containers even though the setting was 1 and the NM had < 1 available vcores. So I am still trying to figure out why this worked in this use case so I am clear on the setting.
The Myriad scheduler running inside the RM dynamically changes the RM's perspective of an FGS NM's capacity, based on the offers received from Mesos. When Mesos offers resources to Myriad from a slave node running an FGS NM, the Myriad scheduler projects those resources as additional capacity available on the FGS NM at that moment. Thus, the RM's YARN scheduler believes the FGS NM has non-zero capacity (at that moment) and schedules containers. If Mesos doesn't offer any resources from an FGS slave, the FGS NM will have 0 vcores and no containers will get allocated to that NM.

Santosh

On Wed, Dec 2, 2015 at 11:32 AM, John Omernik <[email protected]> wrote:

> So I am still confused by the FGS/minimum-allocation-vcores.
>
> > This also implicitly means that YARN cannot allocate a container unless at least "min allocation vcores" are available on an NM. If an NM has less than "min allocation vcores", that NM will not get any containers allocated to it, unless existing containers finish (in the case of plain YARN and with Myriad CGS) or unless Mesos offers more resources (in the case of FGS).
>
> I actually found that I had the setting at 1, and I had 1 CGS NM running and 4 FGS (zero profile) NMs running, and when it allocated containers, the zero profiles still allocated containers even though the setting was 1 and the NM had < 1 available vcores. So I am still trying to figure out why this worked in this use case so I am clear on the setting.
>
> John
>
> On Mon, Nov 30, 2015 at 2:16 PM, Santosh Marella <[email protected]> wrote:
>
> > Thanks for trying out these experiments.
> >
> > > I thought this would have broken FGS, but apparently it didn't (I started my nodes with min allocation CPU = 1 and FGS still worked for me... not sure about that, would love feedback there)
> >
> > I presume you are referring to "yarn.scheduler.minimum-allocation-vcores".
> >
> > The behavior of this variable is that when an app requests a container with less than the "min allocation vcores", the RM ends up allocating a container with "min allocation vcores". YARN's default value for this is 1 - i.e. if an app wants a container with < 1 CPU vcore, YARN allocates a container with 1 CPU vcore.
> >
> > This also implicitly means that YARN cannot allocate a container unless at least "min allocation vcores" are available on an NM. If an NM has less than "min allocation vcores", that NM will not get any containers allocated to it, unless existing containers finish (in the case of plain YARN and with Myriad CGS) or unless Mesos offers more resources (in the case of FGS).
> >
> > In that sense, FGS/CGS do not interfere with "min allocation vcores". Rather, FGS and CGS just influence the NM capacities.
> >
> > Hope this helps.
> >
> > Thanks,
> > Santosh
> >
> > On Tue, Nov 24, 2015 at 9:25 AM, John Omernik <[email protected]> wrote:
> >
> > > Since a vast majority of my posts are me struggling with something or breaking something, I thought I'd take the time to craft a story of Myriad success.
> > >
> > > Over the weekend, I took the time to run Elasticsearch on YARN using the es-yarn package that Elasticsearch has. This is a beta package, and it struggles with some components such as "Storage" for the data.
> > >
> > > With MapR I've been able to create a place and some scripts to manage the data issue. This, combined with ES2's dynamic node allocation and Myriad's fine-grained scaling, gives me a powerful way to elasticize Elasticsearch!
> > >
> > > Basically, I did this through some simple steps. The first was to take the "include" file (esinc.sh) from the distribution, add some items to it, and tell the es-yarn framework to use this instead of the included file. This allowed me to set some parameters at start time with environment variables.
> > >
> > > Simple steps.
> > >
> > > Download the es-yarn package and the Elasticsearch zip.
> > >
> > > (Optional: I added https://github.com/royrusso/elasticsearch-HQ as a plugin; basically I unzipped the ES2 zip, ran the plugin script, and then rezipped the package.)
> > >
> > > In MapR I copied the esinc.sh to a location out of the zip (in the root of, say, my working directory, /mapr/mycluster/mesos/dev/es-yarn/) (see below). Then I created a script that is how I start and scale up the clusters (see below). I do have some notes on how it works in the script.
> > >
> > > For basics this was awesome. I didn't use FGS at first because of a bug in es-yarn (https://github.com/elastic/elasticsearch-hadoop/issues/598): if the min allocation CPU is 0 there is a divide by 0. I thought this would have broken FGS, but apparently it didn't (I started my nodes with min allocation CPU = 1 and FGS still worked for me... not sure about that, would love feedback there).
> > >
> > > When I "start" a cluster, I initialize it and eventually my cluster is running with the specified ES nodes. If I want to scale up, I run the script again with the same cluster name. Those nodes are added to the ES cluster, and now I am running two YARN applications. I can only scale down by applications. So, if I want to scale down, I have to kill an application; if I started 3 ES nodes with that application, I'll scale down by 3 nodes. Thus, there is an argument to always scale by one ES node, especially if you are using larger nodes (I wonder what sort of application manager overhead I'd get on that).
> > >
> > > Either way, this worked really well.
> > >
> > > The cool thing was with FGS though. I had one node running in a "small" config and 4 running with zero (even though I set the min allocation size to 1, this still started and seemed to work). When I submitted the request for ES nodes, they got put into Mesos tasks for each container and it worked great. When I scaled the application down, it too worked great. This provided me huge flexibility in scaling up and down without reserving resources for Elasticsearch clusters. Kudos to Myriad!!!!!
> > >
> > > My only comment to the Myriad crew would be a wiki article explaining FGS a little bit. I just "did it" and it worked, but a little bit more on how it works, the challenges, the gotchas, etc. would be outstanding.
> > >
> > > Thanks to everyone's hard work on Myriad. This project has lots of power that it can give admins/users, and I just wanted to share a win here after all of my "how do I ..." posts.
> > >
> > > John
> > >
> > >
> > > startcluster.sh
> > >
> > > #!/bin/bash
> > >
> > > # Perhaps these should be parameters, or at least check to see if they are set in the ENV and, if not, use these?
> > > ES_JAR="elasticsearch-yarn-2.2.0-beta1.jar"   # Jarfile of es-yarn to use
> > > ES_NFSMOUNT="/mapr/mycluster"                 # Root of NFS mount in MapR
> > > ES_BASELOC="/mesos/dev/es-yarn"               # Location of this script, the esinc.sh, and the basis for all things es-yarn
> > > ES_DATALOC="/data"                            # This is the data location, which is $ES_BASELOC$ES_DATALOC; it creates directories under that for each cluster name, then each node
> > > ES_PROVISION_LOC="/tmp/esprovision/"          # Where the jar file for es-yarn and the Elasticsearch zip file are
> > >
> > > # These are your node names. It needs these to do its unicast discovery. This may need to be updated (perhaps I can curl the Mesos master to get the node list).
> > > ES_UNICAST_HOSTS="node1.mydomain.com,node2.mydomain.com,node3.mydomain.com"
> > >
> > > # The Elasticsearch version you are running (the es-yarn jar uses this to pick the right zip file; make sure you have not changed the name of the ES zip file)
> > > ES_VER="2.0.0"
> > >
> > > # Cluster settings: name and ports
> > > ES_CLUSTERNAME="MYESCLUSTER"
> > > ES_TRANSPORT_PORT="59300-59400"
> > > ES_HTTP_PORT="59200-59300"
> > >
> > > # For this run, the number of nodes to add in "this" application (each submission is a YARN application) and the node size
> > > NUM_NODES="3"
> > > NODEMEM="2048"
> > > NODECPU="2"
> > >
> > > # Don't change anything else here:
> > > ES_INCLUDE="${ES_NFSMOUNT}${ES_BASELOC}/esinc.sh"
> > > ES_YARN_JAR="${ES_NFSMOUNT}${ES_BASELOC}/${ES_JAR}"
> > >
> > > ES_ENV="env.ES_CLUSTERNAME=$ES_CLUSTERNAME env.ES_TRANSPORT_PORT=$ES_TRANSPORT_PORT env.ES_HTTP_PORT=$ES_HTTP_PORT env.ES_UNICAST_HOSTS=$ES_UNICAST_HOSTS env.ES_NFSMOUNT=$ES_NFSMOUNT env.ES_BASELOC=$ES_BASELOC env.ES_DATALOC=$ES_DATALOC"
> > >
> > > RUNCMD="hadoop jar $ES_YARN_JAR -start containers=$NUM_NODES hdfs.upload.dir=$ES_PROVISION_LOC es.version=$ES_VER container.mem=$NODEMEM container.vcores=$NODECPU env.ES_INCLUDE=$ES_INCLUDE $ES_ENV"
> > >
> > > echo "Starting Cluster...."
> > >
> > > $RUNCMD
> > >
> > > echo "Application Submitted, check logs for http endpoints (perhaps we should monitor the logs until we get one and then display it as http://IP:port/_plugin/hq)"
> > >
> > >
> > > Append to your customized esinc.sh:
> > >
> > > ###########################################################
> > > echo "Start Customization of esinc for yarn/myriad/mapr"
> > >
> > > export ES_NODE_NAME="$CONTAINER_ID"   # A unique value
> > > export ES_NETWORK_HOST="_non_loopback_"
> > > export ES_PATH_LOGS="${ES_NFSMOUNT}${ES_BASELOC}${ES_DATALOC}/$ES_CLUSTERNAME/$ES_NODE_NAME/logs"
> > > export ES_PATH_DATA="${ES_NFSMOUNT}${ES_BASELOC}${ES_DATALOC}/$ES_CLUSTERNAME/$ES_NODE_NAME/data"
> > >
> > > # These are set here; eventually they should be set by the framework and these would just grab the value from above at the framework level
> > >
> > > # Set log level if needed
> > > ES_DEBUG_OPTS="-Des.logger.level=DEBUG"
> > >
> > > # Cluster name and node name - think through the node name here more... perhaps we could generate a name so that if a node failed and tried to restart it would have the same node name...
> > > ES_NAME_OPTS="-Des.cluster.name=$ES_CLUSTERNAME -Des.node.name=$ES_NODE_NAME"
> > >
> > > # Paths to log locations and data file locations. Todo: perhaps add labels or create volumes with the MapR REST API
> > > ES_PATH_OPTS="-Des.path.logs=$ES_PATH_LOGS -Des.path.data=$ES_PATH_DATA"
> > >
> > > # Networking options. Need to set the other nodes for discovery, the network host so it's listening on the right interfaces, and the ports used for transport and http
> > > ES_NETWORK_OPTS="-Des.discovery.zen.ping.unicast.hosts=$ES_UNICAST_HOSTS -Des.network.host=$ES_NETWORK_HOST -Des.transport.tcp.port=$ES_TRANSPORT_PORT -Des.http.port=$ES_HTTP_PORT"
> > >
> > > export ES_JAVA_OPTS="$ES_DEBUG_OPTS $ES_NAME_OPTS $ES_PATH_OPTS $ES_NETWORK_OPTS"
> > >
> > > # This is just for debugging
> > > env
> > >
> > > #########################################
> > >

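As John describes above, scale-down happens per YARN application rather than per ES node, so removing nodes means killing one of the es-yarn submissions with the stock YARN CLI. A rough sketch (the application id below is made up for illustration):

    # Find the es-yarn applications that are currently running
    yarn application -list

    # Kill one submission; every ES node started by that submission goes away with it
    yarn application -kill application_1448000000000_0042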