Hi Renan,

Since you did a similar exercise using Go [1], it will be nice to see your feedback and guidance on the discussions Gourav is summarizing below.
Suresh

[1] - http://markmail.org/thread/ymj7yqvvbhrjwv3s

> On Oct 17, 2016, at 11:32 PM, Shenoy, Gourav Ganesh <[email protected]> wrote:
>
> Hi dev,
>
> Now that I have been able to get jobs scheduled via Aurora, I thought I
> should summarize my understanding. I would also like to briefly outline the
> plan I am working on for using Mesos with Airavata.
>
> Apache Aurora:
>
> · Aurora, similar to Marathon & Chronos, is a service scheduler framework
>   for Mesos. It has been built for scheduling long-running services & cron
>   jobs on Mesos.
> · The advantage of Aurora (over Marathon & Chronos) is that it works well
>   for one-off jobs as well – i.e. if I want to run a job and get the
>   output, Aurora is a better fit than Marathon & Chronos, since Marathon
>   will never let the job exit (it keeps restarting it on slaves) & Chronos
>   is ONLY for crons.
> · Aurora also allows fine-grained control of the jobs that need to be
>   submitted through its concepts of jobs, tasks, and processes: a job can
>   consist of one or more tasks, and a task can consist of one or more
>   processes.
> · Aurora manages jobs that are made up of tasks; Mesos manages the tasks
>   that consist of processes; Thermos (the Aurora executor) manages the
>   processes.
> · We can control resource utilization at the task level because of the
>   above job abstractions that Aurora provides.
> · Among many other features, a useful one is resource-quota management for
>   users & the ability to support multiple users running jobs.
>
> Current focus:
>
> · I am currently working on building a Thrift-based client for Aurora, and
>   have been successful in implementing one, but with limited operations.
> · I will be adding support for more operations, keeping them aligned with
>   Airavata job submission/monitoring requirements.
> · I am currently focusing on targeting Airavata deployment to Mesos on a
>   single cluster (e.g. AWS).
> The flow would look as follows:
> <image001.png>
> · As you can see, currently there is just a single Mesos cluster. The
>   future focus would be to expand this to multiple clusters.
>
> Subsequent work:
> · Once we are able to test Airavata deployment to a single cluster
>   successfully, we can expand this to a multi-cluster environment.
> · Here we would have multiple Mesos clusters, which would somehow need to
>   be managed. But the overall flow would look as follows:
> <image002.png>
>
> · We can either have multiple Mesos masters (one for each individual
>   cluster) that are connected to each other via VPN, or have a single
>   master – in which case we would need to consider all other nodes as
>   slaves.
> · This is a design issue which needs discussion, and Suresh has some ideas
>   on how to do this.
>
> Thanks and Regards,
> Gourav Shenoy
>
> From: Suresh Marru <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Friday, October 7, 2016 at 11:43 PM
> To: Airavata Dev <[email protected]>
> Subject: Re: Mesos based meta-scheduling for Airavata
>
> Hi Gourav,
>
> Thank you for the nice informative summaries; posts like these are always
> educational. Keep ’em coming.
>
> Suresh
>
> On Oct 7, 2016, at 10:56 PM, Shenoy, Gourav Ganesh <[email protected]> wrote:
>
> Hi dev,
>
> I have been exploring different frameworks for Mesos which would help our
> use-case of providing Airavata the capability to run jobs in a Mesos based
> ecosystem. In particular, I have been playing around with Marathon & Chronos
> and I am now going to be working on Apache Aurora.
>
> I have summarized my understanding about Mesos, Marathon & Chronos below. I
> will send out a separate email about Aurora later.
>
> Apache Mesos:
>
> · Apache Mesos is an open-source cluster manager, in the sense that it
>   helps deploy & manage different frameworks (or applications) in a large
>   clustered environment easily.
> · Mesos provides the ability to utilize the underlying shared pool of nodes
>   as a single compute unit. That is, it can run many applications on these
>   nodes efficiently.
> · Mesos uses the concept of “offers” for scheduling and running jobs on the
>   underlying nodes. When a framework (application) wants to run
>   computations/jobs on the cluster, Mesos decides how many resources it will
>   “offer” that framework based on availability. The framework then decides
>   which resources to use from the offer, and subsequently runs the
>   computation/job on those resources.
> · In a typical cluster, you will have 3 or more Mesos masters & multiple
>   Mesos slaves. Multiple Mesos masters help in providing high availability –
>   if one master goes down, Mesos will elect a new leader (master) using
>   Zookeeper.
> · The task mentioned above of providing “offers” to frameworks is done by a
>   master, whereas the slaves are the ones that run these computations.
>
> · Some additional points:
>   o I built a Mesos cluster with 3 masters & 2 slaves on EC2.
>   o Each master & slave has 1GB of RAM & 1 vCPU with 20GB of disk space.
>
> Marathon:
>
> · Marathon is considered a framework that runs on top of Mesos. It is a
>   container orchestration platform for Mesos and essentially acts as a
>   service scheduler.
> · It is named “marathon” because it is intended for long-running
>   applications. That is, Marathon makes sure that the service it is running
>   never stops – if a service goes down or the slave on which the service is
>   running dies, Marathon keeps restarting it on different slaves.
> · In some sense Marathon is very good for ensuring high availability of
>   services. That is, instead of running services directly on Mesos, run them
>   in Marathon if you never want them to die.
>   Note: You can decide to run a service on multiple slave nodes and, if
>   resources on these slaves are available, Mesos will “offer” them to
>   Marathon.
> · It is called a container orchestration platform because it “launches”
>   these services inside a container – either a Docker or a Mesos container.
> · In my opinion it is not a suitable “job scheduler” for Airavata because in
>   Airavata we need to run a job and get the output rather than keeping it
>   running always. Instead, we can run other schedulers (Chronos/Aurora) as a
>   service in Marathon.
>
>
> Chronos:
>
> · Chronos is a cron scheduler for Mesos. It is good for running scheduled
>   jobs – jobs that need to be run a certain number of times, repeatedly
>   after certain intervals.
> · Chronos also provides the ability to add dependencies between jobs. That
>   is, if job1 is dependent on another job2, then it will run job2 first and
>   then run job1 after job2 completes. It also builds a Directed Acyclic
>   Graph (DAG) based on these dependencies.
> · Similar to Marathon, Chronos receives “offers” from the Mesos master
>   whenever it needs to run a job on Mesos.
> · Again, I found that Chronos does not fit the Airavata use-case since I
>   could not find a way to run one-off jobs via Chronos – you need to specify
>   an interval time for Chronos, & Chronos then re-runs the job after that
>   interval is complete (even if you decide to specify num. of
>   repetitions=1).
>
>
> Some additional points:
> · Marathon & Chronos both have REST API support – e.g. you can submit jobs
>   via APIs along with other interactions such as listing jobs, etc.
> · I installed the Marathon & Chronos frameworks on the Mesos master nodes.
>   This is what their health looks like on the Mesos dashboard:
>
> <image002.png>
>   As you can see, there are 3 active tasks running in Chronos & 4 active
>   tasks (long running) in Marathon.
>
> · I also installed Chronos as a service inside Marathon, and this is what
>   it looks like in the Marathon UI:
>
>
> <image004.png>
>   Interestingly, Chronos (as a service in Marathon) was smart enough to
>   identify the jobs submitted via Chronos (as a framework on Mesos) &
>   vice-versa.
>
> · Also, the Mesos dashboard lists the active tasks it is running & details
>   about which slave each task is running on. It also lists completed tasks.
>   The “Sandbox” gives you access to the stdout/stderr files for the tasks as
>   well as any other directories that were created as part of the task.
>
>
> <image005.png>
>
> Pardon me for this long email. Next, I will explore Apache Aurora, which
> seems a better fit for the Airavata use-case because it provides the
> features that Chronos supports and can also run one-off jobs.
>
> Thanks and Regards,
> Gourav Shenoy
>
> From: "Shenoy, Gourav Ganesh" <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Friday, September 23, 2016 at 4:43 PM
> To: "[email protected]" <[email protected]>
> Subject: Mesos based meta-scheduling for Airavata
>
> Hi Dev,
>
> I am working on this project of building a Mesos based meta-scheduler for
> Airavata, along with Shameera & Mangirish. Here is the jira link:
> https://issues.apache.org/jira/browse/AIRAVATA-2082
>
> · We have identified some tasks that would be needed for achieving this,
>   and at a high level it would consist of:
>   1. Resource provisioning – We need to provision resources on cloud & HPC
>      infrastructures such as EC2, Jetstream, Comet, etc.
>   2. Building a cluster – Deploying a Mesos cluster on the set of nodes
>      obtained from (1) above for task management.
>   3. Selecting a scheduler – We need to investigate the scheduler to use
>      with the Mesos cluster.
>      Some of the options are Marathon and Aurora. But we need to find one
>      that suits our needs of running serial as well as parallel (MPI) jobs.
>   4. Installing & running applications on this cluster – Once the cluster
>      has been deployed and a scheduler choice made, we need to be able to
>      install and run applications on this cluster using Airavata.
>
> · Until now we were able to look into the following:
>   o Resource provisioning:
>     § We explored several options for provisioning resources – using cloud
>       libraries as well as via Ansible scripts.
>     § We built an OpenStack4J Java module which would provision instances
>       on OpenStack based clouds (e.g. Jetstream).
>     § We also built a CloudBridge Python module for provisioning EC2
>       instances on Amazon. CloudBridge can also be used to provision
>       instances on OpenStack.
>     § We wrote Ansible scripts for bringing up instances on both AWS and
>       OpenStack based clouds.
>
>     § Key Points: CloudBridge and OpenStack4J are powerful libraries for
>       resource provisioning, but currently they do single-instance
>       provisioning and do not support templated boot options such as
>       CloudFormation (for AWS) & Heat (for OpenStack).
>
>   o Building a cluster:
>     § We wrote an Ansible script for deploying a Mesos-Marathon cluster on
>       a set of nodes. This script will install necessary dependencies such
>       as Zookeeper.
>     § We tested this on OpenStack based clouds & on EC2.
>     § OpenStack Magnum provides excellent support for resource provisioning
>       & deploying a Mesos cluster, but we are running into some problems
>       while trying it.
>
>   o Installing a scheduler:
>     § Our Ansible script is currently installing Marathon as the scheduler
>       on Mesos. We haven’t yet submitted jobs using Marathon.
>
> · Although not finalized, we are inclined towards using the Ansible
>   approach for the above, as Ansible also provides Python APIs, which will
>   allow us to integrate it with Airavata via Thrift.
>   Hence we will be able to easily invoke the Ansible scripts from code
>   without needing to use the command-line interface.
>
> · We are also progressively working on some work-items such as:
>   o Exploring options to provision and deploy a Mesos-Marathon cluster on
>     HPC systems such as Comet. The challenge would be to use Ansible to
>     provision resources and deploy the cluster. Once we have a cluster, we
>     can try running applications.
>   o Exploring different scheduler options for running serial and parallel
>     (MPI) jobs on such heterogeneous clusters.
>   o Exploring orchestration options such as OpenStack Heat, AWS
>     CloudFormation, OpenStack Magnum, etc.
>
> Any suggestions and comments are highly appreciated.
>
> Thanks and Regards,
> Gourav Shenoy
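
For readers new to the "offer" model described in the Mesos summary earlier in this thread, here is a minimal Python sketch of the two-sided decision: the master advertises free resources on each slave, and the framework decides which offer (if any) to use for a job. This is only an illustration under stated assumptions. The names Offer, Job, master_make_offers and framework_accept are hypothetical and are not part of the Mesos, Marathon, Chronos or Aurora APIs, and the slave sizes are made up.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Offer:
    """Resources one slave currently has free (hypothetical type)."""
    agent: str
    cpus: float
    mem_mb: int


@dataclass
class Job:
    """Resource needs of a single job (hypothetical type)."""
    name: str
    cpus: float
    mem_mb: int


def master_make_offers(free: Dict[str, Tuple[float, int]]) -> List[Offer]:
    """Master side: advertise whatever each slave has free right now."""
    return [
        Offer(agent, cpus, mem_mb)
        for agent, (cpus, mem_mb) in free.items()
        if cpus > 0 and mem_mb > 0
    ]


def framework_accept(offers: List[Offer], job: Job) -> Optional[Offer]:
    """Framework side: take the first offer big enough for the job,
    or return None to decline every offer."""
    for offer in offers:
        if offer.cpus >= job.cpus and offer.mem_mb >= job.mem_mb:
            return offer
    return None


# Made-up free resources on two slaves: (cpus, memory in MB).
free = {"slave-1": (0.5, 512), "slave-2": (1.0, 1024)}
offers = master_make_offers(free)

chosen = framework_accept(offers, Job("one-off", cpus=1.0, mem_mb=768))
print(chosen.agent if chosen else "declined")  # prints "slave-2"
```

The point of the split is that the master only knows what is free, while the placement decision stays with the framework. That is why Marathon, Chronos and Aurora can each apply their own scheduling policy on the same cluster.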
