The Application Factory component is called “gfac” in the code base. This is 
the part that handles interfacing with the remote resource, most often over 
SSH, though other providers exist. The Orchestrator routes jobs to GFAC 
instances.
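
To make that concrete, here is a purely illustrative sketch of the provider 
idea; the interface below is hypothetical, not Airavata's actual GFAC API:

    // Hypothetical sketch, not Airavata's actual GFAC API: GFAC talks to
    // remote resources through pluggable providers, and the Orchestrator
    // decides which GFAC instance receives a given job.
    public interface JobSubmissionProvider {
        /** Submit a job script to the remote resource, e.g. over SSH. */
        void submit(String jobScript) throws Exception;
    }

A cloud-cluster provider would then be one more implementation of the same 
abstraction, alongside the existing SSH one.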

From: Mangirish Wagle <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, March 23, 2016 at 11:56 AM
To: "[email protected]" <[email protected]>
Subject: Re: [GSOC Proposal] Cloud based clusters for Apache Airavata

Hello Team,

I was drafting the GSoC proposal and had a quick question about how the 
project would integrate with Apache Airavata.

Which component in Airavata would call the service to provision the cloud 
cluster?

I am looking at the Airavata architecture diagram, and my understanding is 
that this would be treated as a new application with a separate application 
interface in the 'Application Factory' component. Also, the workflow 
orchestrator would have the intelligence to decide which jobs should be 
submitted to cloud-based clusters.

Please let me know whether my understanding is correct.

Thank you.

Best Regards,
Mangirish Wagle

On Tue, Mar 22, 2016 at 2:28 PM, Pierce, Marlon <[email protected]> wrote:
Hi Mangirish, please add your proposal to the GSOC 2016 site.

From: Mangirish Wagle <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, March 17, 2016 at 3:35 PM
To: "[email protected]" <[email protected]>
Subject: [GSOC Proposal] Cloud based clusters for Apache Airavata

Hello Dev Team,

I had the opportunity to interact with Suresh and Shameera, and we discussed 
an open requirement in Airavata: expanding its capabilities to submit jobs to 
cloud-based clusters in addition to HPC/HTC clusters.

The idea is to dynamically provision a cloud cluster in an environment like 
Jetstream, based on a configuration worked out by Airavata, and to operate it 
with distributed-systems management software such as Mesos. The initial 
high-level goals would be:

  1.  Airavata categorizes certain jobs to be run on cloud-based clusters and 
figures out the required hardware configuration for the cluster.
  2.  The proposed service provisions the cluster with the required resources.
  3.  An Ansible script configures a Mesos cluster on the provisioned 
resources.
  4.  Airavata submits the job to the Mesos cluster.
  5.  Mesos works out an efficient resource allocation within the cluster, 
runs the job, and fetches the result.
  6.  The cluster is deprovisioned automatically when not in use.

The project would mainly focus on points 2 and 6 above.
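
For concreteness, here is a minimal sketch of what the service's contract for 
points 2 and 6 might look like; all of the names below are hypothetical, and 
nothing like this exists in Airavata yet:

    // Hypothetical sketch of the proposed provisioning service; these
    // names do not exist in Airavata today.
    public interface CloudClusterProvisioner {

        /** Point 2: boot nodeCount nodes with the given image and flavor
         *  and return an identifier for the resulting Mesos cluster. */
        String provision(String imageId, String flavorId, int nodeCount)
                throws ProvisioningException;

        /** Point 6: tear the cluster down once Airavata reports it idle. */
        void deprovision(String clusterId) throws ProvisioningException;
    }

    /** Checked exception for provisioning failures (also hypothetical). */
    class ProvisioningException extends Exception {
        public ProvisioningException(String message, Throwable cause) {
            super(message, cause);
        }
    }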

To start with, I am currently working on a prototype that sets up compute 
nodes in an OpenStack environment using jclouds (targeted at Jetstream). I am 
also planning to explore using the OpenStack Heat engine to orchestrate the 
cluster. However, going ahead, Airavata would support other clouds such as 
Amazon EC2 or the Comet cluster, so we need a generic solution for achieving 
this goal.
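
As a rough sketch of that prototype, something like the following should boot 
nodes against an OpenStack Nova endpoint via jclouds; the endpoint, 
credentials, image ID, and flavor ID are placeholders, not real Jetstream 
values:

    import java.util.Set;

    import org.jclouds.ContextBuilder;
    import org.jclouds.compute.ComputeService;
    import org.jclouds.compute.ComputeServiceContext;
    import org.jclouds.compute.RunNodesException;
    import org.jclouds.compute.domain.NodeMetadata;
    import org.jclouds.compute.domain.Template;

    public class OpenStackProvisioningSketch {
        public static void main(String[] args) throws RunNodesException {
            // OpenStack Keystone identities take the form "tenant:user";
            // the endpoint and credentials here are placeholders.
            ComputeServiceContext context = ContextBuilder.newBuilder("openstack-nova")
                    .endpoint("https://openstack.example.org:5000/v2.0/")
                    .credentials("demo-tenant:demo-user", "secret")
                    .buildView(ComputeServiceContext.class);
            ComputeService compute = context.getComputeService();

            // Describe the desired node; flavor and image IDs are cloud-specific.
            Template template = compute.templateBuilder()
                    .hardwareId("RegionOne/2")          // placeholder flavor ID
                    .imageId("RegionOne/ubuntu-14.04")  // placeholder image ID
                    .build();

            // Boot two nodes tagged as members of the future Mesos cluster.
            Set<? extends NodeMetadata> nodes =
                    compute.createNodesInGroup("mesos-cluster", 2, template);
            for (NodeMetadata node : nodes) {
                System.out.println(node.getId() + " -> " + node.getPublicAddresses());
            }

            context.close();
        }
    }

The same ComputeService abstraction is what makes jclouds attractive for the 
generic, multi-cloud goal: in principle, swapping the "openstack-nova" 
provider ID for "aws-ec2" targets EC2 with largely the same code.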

Another approach, which might be more efficient in terms of performance and 
time, is to use container-based clouds built on Docker and Kubernetes, which 
would have substantially shorter bootstrap times than cloud VMs. This would 
be a future prospect, as not all the clusters we target may support 
containerization.

This has been identified as a potential GSoC project, and I will be working 
on drafting a proposal based on this idea.

Any inputs/ comments/ suggestions would be very helpful.

Best Regards,
Mangirish Wagle
