Hi Gourav,

Please go ahead and submit a proposal draft through the GSoC 2016 website. I 
personally recommend the Google Docs option over posting drafts to the 
Airavata wiki, since I can then make comments inline.

Thanks,

Marlon


From: Gourav Rattihalli 
<[email protected]<mailto:[email protected]>>
Reply-To: "[email protected]<mailto:[email protected]>" 
<[email protected]<mailto:[email protected]>>
Date: Monday, March 21, 2016 at 10:22 AM
To: "[email protected]<mailto:[email protected]>" 
<[email protected]<mailto:[email protected]>>
Subject: [GSoC Proposal] - Integrating Job and Cloud Health Information of 
Apache Aurora with Apache Airavata

Hi Dev Team,

Please review the following GSoC proposal that I plan to submit:

Title: Integrating Job and Cloud Health Information of Apache Aurora with 
Apache Airavata

Abstract:
This project will incorporate Apache Aurora to enable Airavata to launch jobs 
on large cloud environments, and collect the related information on the health 
of each job and the cloud resources. The project will also analyze the current 
micro-services architecture of Airavata and develop code for an updated 
architecture for modules such as Logging. As a result, another outcome of this 
project will be a module that collects all the logging 
information from the various execution points in an Airavata job's lifecycle 
and provides search and mining capability.

Introduction:

Apache Aurora is a service scheduler that runs on top of Apache Mesos. This 
combination enables long-running services that take advantage of 
Mesos's scalability, fault tolerance, and resource isolation. Apache Mesos 
is a cluster manager that provides information about the state of the 
cluster, and Aurora uses that knowledge to make scheduling decisions. For 
example, when a machine fails, Aurora automatically reschedules the services 
that were running on it onto a healthy machine to keep them running. Aurora 
tracks each job as being in one of the following states: 
pending, assigned, starting, running, or finished.


Apache Aurora requires a configuration file with the ".aurora" extension to 
launch jobs. The following is an example of an Aurora configuration file:


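import os

# A process is a single command line that Aurora runs.
hello_world_process = Process(name = 'hello_world', cmdline = 'echo hello world')

# A task groups processes and declares the resources they need.
hello_world_task = Task(
  resources = Resources(cpu = 0.1, ram = 16 * MB, disk = 16 * MB),
  processes = [hello_world_process])

# A job binds the task to a cluster, role, and environment.
# environment is set to 'test' to match the job key in the launch command.
hello_world_job = Job(
  cluster = 'cluster1',
  environment = 'test',
  role = os.getenv('USER'),
  task = hello_world_task)

# Aurora reads the jobs to launch from this list.
jobs = [hello_world_job]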


To launch the job with the above configuration, we use:


aurora job create cluster1/$USER/test/hello_world hello_world.aurora


This project will develop modules in Airavata that automatically generate the 
Aurora configuration file needed to launch a job on an Aurora-managed cluster 
in a cloud environment. The Aurora user interface, as shown in the web portal 
displayed above, provides detailed information on the job status, job name, 
start and finish times, location of the logs, and resource usage. This project 
will add a module to Apache Aurora to pull this detailed information using 
the Aurora HTTP API.
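
As an illustration of the configuration generation described above, here is a 
minimal sketch in Python. It is only one possible approach, not the planned 
Airavata implementation. The function name render_aurora_config, its 
parameters, and the 'test' default for the environment are assumptions made 
for this example; the template itself simply mirrors the hello_world 
configuration shown earlier.

```python
from string import Template

# Hypothetical template that mirrors the hello_world example in this message.
_AURORA_TEMPLATE = Template("""\
import os

${name}_process = Process(name = '${name}', cmdline = '${cmdline}')

${name}_task = Task(
  resources = Resources(cpu = ${cpu}, ram = ${ram_mb} * MB, disk = ${disk_mb} * MB),
  processes = [${name}_process])

${name}_job = Job(
  cluster = '${cluster}',
  environment = '${environment}',
  role = os.getenv('USER'),
  task = ${name}_task)

jobs = [${name}_job]
""")


def _check_literal(label, value):
    """Reject values that would break out of a single-quoted string."""
    if any(ch in value for ch in "'\\\n"):
        raise ValueError(f"{label} contains a quote, backslash, or newline")
    return value


def render_aurora_config(name, cmdline, cluster, environment="test",
                         cpu=0.1, ram_mb=16, disk_mb=16):
    """Return the text of a .aurora file for a single-process job."""
    # The name is reused as a variable prefix, so it must be an identifier.
    if not name.isidentifier():
        raise ValueError("name must be a valid Python identifier")
    return _AURORA_TEMPLATE.substitute(
        name=name,
        cmdline=_check_literal("cmdline", cmdline),
        cluster=_check_literal("cluster", cluster),
        environment=_check_literal("environment", environment),
        cpu=float(cpu), ram_mb=int(ram_mb), disk_mb=int(disk_mb))
```

Calling render_aurora_config("hello_world", "echo hello world", "cluster1") 
returns text equivalent to the example file, which could then be written to 
hello_world.aurora and passed to the aurora client.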

Goals:

  *   This project will investigate how Apache Aurora collects information 
about the cluster environment for display on the Aurora web interface. We will 
study the Aurora HTTP API, retrieve all the information related to the target 
infrastructure and job health, and make it available to the Airavata job 
submission module.

  *   We will process the information retrieved from Aurora and convert it 
into a format that Airavata can use for further action.

  *   We will use appropriate design patterns to integrate Aurora with the 
Airavata framework as one of the options for Big Data and Cloud resource 
frameworks.

  *   We will make the resource information from Aurora available for display 
on the Airavata dashboard.
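
To make the conversion goal above more concrete, here is a rough Python 
sketch of turning retrieved job information into a typed record. The JSON 
field names (jobKey, state, cpu, ramMb, diskMb) and the JobHealth record are 
hypothetical and are not taken from the Aurora HTTP API; only the five job 
states come from the description in this proposal.

```python
import json
from dataclasses import dataclass
from enum import Enum


class JobState(Enum):
    """Job states as listed in the introduction of this proposal."""
    PENDING = "pending"
    ASSIGNED = "assigned"
    STARTING = "starting"
    RUNNING = "running"
    FINISHED = "finished"


@dataclass(frozen=True)
class JobHealth:
    """Hypothetical record that an Airavata module could consume."""
    job_key: str
    state: JobState
    cpu: float
    ram_mb: int
    disk_mb: int


def parse_job_health(payload: str) -> JobHealth:
    """Convert a JSON document (assumed field names) into a JobHealth."""
    data = json.loads(payload)
    return JobHealth(
        job_key=data["jobKey"],
        # Normalize case so "RUNNING" and "running" map to the same state.
        state=JobState(data["state"].lower()),
        cpu=float(data["cpu"]),
        ram_mb=int(data["ramMb"]),
        disk_mb=int(data["diskMb"]),
    )
```

An unknown state raises ValueError and a missing field raises KeyError, so 
malformed input is reported instead of being passed on to the dashboard.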

Any comments and suggestions would be very helpful.

-Gourav Rattihalli
