Hi Dimuthu,

Very good summary! I am not sure if you have come across it, but DC/OS 
(DataCenter Operating System) is a container orchestration platform based on 
Apache Mesos. The beauty of DC/OS is that development and deployment are 
simple, yet it remains strong on most of the criteria you list: 
multi-datacenter, multi-cloud, scalability, high availability, fault tolerance, 
and load balancing. More importantly, the community support is fantastic.

DC/OS has an extensive service catalog and is more like a PaaS for containers 
(though not restricted to containers): you can run services like Spark, Kafka, 
RabbitMQ, etc. out of the box with a single-click install. With Apache Mesos as 
the underlying resource manager, deploying applications across different 
datacenters is seamless. There is a concept of SERVICE vs JOB: a service is 
considered long running, and DC/OS will make sure it keeps running (if a 
service fails, it spins up a new one), whereas jobs are one-time executions. 
This comes in handy for using DC/OS as a target runtime for Airavata.
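
To make the SERVICE vs JOB distinction concrete, here is a minimal Python 
sketch of the two behaviours. It is only an illustration under stated 
assumptions: it does not use any DC/OS API, and run_job, run_service, 
make_flaky and max_restarts are hypothetical names chosen for this example.

```python
def run_job(task):
    """JOB semantics: run the task once and report the outcome, never retry."""
    try:
        task()
        return "finished"
    except Exception:
        return "failed"


def run_service(task, max_restarts=3):
    """SERVICE semantics: restart the task whenever it fails.

    Returns the number of restarts that were needed. The max_restarts bound
    is an assumption added so that this sketch always terminates.
    """
    restarts = 0
    while True:
        try:
            task()
            return restarts
        except Exception:
            if restarts == max_restarts:
                raise
            restarts += 1


def make_flaky(failures):
    """Hypothetical task that crashes `failures` times and then succeeds."""
    remaining = [failures]

    def task():
        if remaining[0] > 0:
            remaining[0] -= 1
            raise RuntimeError("crashed")

    return task


if __name__ == "__main__":
    print("job outcome:", run_job(make_flaky(1)))
    print("service restarts:", run_service(make_flaky(2)))
```

The same flaky task is reported as failed when run as a job, but is brought 
back up when run as a service.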

We used DC/OS for our class project to run the distributed task execution 
prototype we built (which uses RabbitMQ messaging). Here’s a link to my blog 
post explaining the process: 
https://gouravshenoy.github.io/apache-airavata/spring17/2017/04/20/final-report.html
I have also attached a PDF of the paper we wrote for the class, which explains 
the task execution process and one solution using RabbitMQ messaging.

I had also started work on containerizing Airavata and on a unified build and 
deployment mechanism with CI/CD on DC/OS. Unfortunately, I couldn’t complete 
it due to time constraints, but I would be more than happy to work with you on 
this. Let me know and we can coordinate.

Thanks and Regards,
Gourav Shenoy

From: DImuthu Upeksha <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, October 5, 2017 at 9:52 AM
To: "[email protected]" <[email protected]>
Subject: Re: Linked Container Services for Apache Airavata Components - Phase 1 
- Requirement identification

Hi Marlon,

Thanks for the input. I understand your point about availability modes and will 
keep it in mind while designing the PoC. CI/CD is the one I missed; thanks for 
pointing it out.

Thanks
Dimuthu

On Thu, Oct 5, 2017 at 7:04 PM, Pierce, Marlon 
<[email protected]<mailto:[email protected]>> wrote:
Thanks, Dimuthu, this is a good summary. Others may comment about Kafka, 
stateful versus stateless parts of Airavata, etc.  You may also find some of 
this discussion in the mailing list archives.

Active-active vs. active-passive is a good question, and we have typically 
thought of this in terms of individual Airavata components rather than the 
whole system.  Some components can be active-active (like a stateless 
application manager), while others (like the orchestrator example you give 
below) are stateful and may be better as active-passive.

There is also the issue of system updates and continuous deployments, which 
could be added to your list.

Marlon


From: "[email protected]<mailto:[email protected]>" 
<[email protected]<mailto:[email protected]>>
Reply-To: "[email protected]<mailto:[email protected]>" 
<[email protected]<mailto:[email protected]>>
Date: Thursday, October 5, 2017 at 2:40 AM
To: "[email protected]<mailto:[email protected]>" 
<[email protected]<mailto:[email protected]>>
Subject: Linked Container Services for Apache Airavata Components - Phase 1 - 
Requirement identification

Hi All,

Over the last few days, I have been going through the requirements and design 
of the current Airavata setup, and I identified the following as the key areas 
to focus on in the technology evaluation phase:

Microservices deployment platform (container management system)

Possible candidates: Google Kubernetes, Apache Mesos, Apache Helix
As most of the operational units of Airavata are expected to move to a 
microservices-based deployment pattern, having a unified deployment platform to 
manage those microservices will make DevOps operations easier and faster. 
On the other hand, although writing and maintaining a single microservice is 
fairly straightforward, running and monitoring multiple microservices and 
maintaining their lifecycles manually in a production environment is tiresome 
and complex. With such a deployment platform, we can easily automate many of 
the pain points mentioned above.

Scalability

We need a solution that can scale easily depending on the load on different 
parts of the system. For example, the workers in the post-processing pipeline 
should be able to scale up and down depending on the events coming into the 
message queue.
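
As a rough illustration of queue-driven scaling, the following Python sketch 
derives a worker count from the current backlog. It is a minimal sketch under 
assumptions, not an existing Airavata or platform API: desired_workers, 
msgs_per_worker, min_workers and max_workers are hypothetical names and the 
default values are arbitrary.

```python
import math


def desired_workers(queue_depth, msgs_per_worker=10, min_workers=1, max_workers=20):
    """Return how many workers should run for the given message backlog.

    The result is the backlog divided by the per-worker capacity, rounded up,
    and then clamped to the [min_workers, max_workers] range.
    """
    if queue_depth < 0:
        raise ValueError("queue_depth must be non-negative")
    needed = math.ceil(queue_depth / msgs_per_worker)
    return max(min_workers, min(max_workers, needed))


if __name__ == "__main__":
    for depth in (0, 35, 500):
        print(depth, "queued messages ->", desired_workers(depth), "workers")
```

A container management system would apply a rule like this periodically and 
start or stop worker containers to match the returned count.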

Availability

We need to support deploying the solution in multiple geographically distant 
data centers. When evaluating container management systems, we should consider 
this a primary requirement. However, one thing I am not sure about is the 
availability mode that Airavata normally expects. Is it active-active or 
active-passive?

Service discovery

Once we move to a microservices-based deployment pattern, there could be 
scenarios where we need service discovery. For example, if we scale up the API 
Server to handle increased load, we might have to put a load balancer between 
the client and the API Server instances. In that case, service discovery is 
essential to supply the load balancer with the healthy API Server endpoints 
that are currently running in the system.
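
The load balancer scenario can be sketched in Python as a heartbeat-based 
registry plus a round-robin picker. This is a minimal in-memory sketch under 
assumptions: ServiceRegistry, RoundRobinBalancer, the TTL value and the 
endpoint strings are all hypothetical, and a real deployment would rely on the 
discovery mechanism of the chosen container platform instead.

```python
import itertools
import time


class ServiceRegistry:
    """Tracks endpoints by their last heartbeat; stale ones count as unhealthy."""

    def __init__(self, ttl_seconds=10.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._last_seen = {}  # endpoint -> time of the most recent heartbeat

    def heartbeat(self, endpoint):
        self._last_seen[endpoint] = self._clock()

    def healthy_endpoints(self):
        now = self._clock()
        return sorted(e for e, seen in self._last_seen.items()
                      if now - seen <= self._ttl)


class RoundRobinBalancer:
    """Cycles through whatever endpoints the registry currently reports healthy."""

    def __init__(self, registry):
        self._registry = registry
        self._counter = itertools.count()

    def pick(self):
        endpoints = self._registry.healthy_endpoints()
        if not endpoints:
            raise RuntimeError("no healthy API Server instances")
        return endpoints[next(self._counter) % len(endpoints)]


if __name__ == "__main__":
    registry = ServiceRegistry(ttl_seconds=10.0)
    registry.heartbeat("api-server-1:8930")  # hypothetical endpoints
    registry.heartbeat("api-server-2:8930")
    balancer = RoundRobinBalancer(registry)
    print([balancer.pick() for _ in range(4)])
```

An instance that stops sending heartbeats drops out of the rotation once its 
TTL expires, so clients are only routed to instances that are still alive.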

Cluster coordination

Although microservices are supposed to be stateless in most cases, we might 
have scenarios where we need to feed some state to particular microservices. 
For example, if we implement a microservice that performs the Orchestrator's 
role, there could be issues if we keep multiple instances of it in several data 
centers to increase availability. According to my understanding, there should 
be only one Orchestrator running at a time, as it is the one that makes the 
decisions in the job execution process. So, if we keep multiple instances of it 
running in the system, there should be some sort of leader election among the 
Orchestrator quorum.
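
One common way to get a single active instance is a lease: whoever holds an 
unexpired lease is the leader, and the others stand by. The Python sketch below 
shows the idea with an in-memory store. It is an illustration under 
assumptions, not a production mechanism: LeaseStore and the instance names are 
hypothetical, the store is single-process and not thread-safe, and a real 
setup would need an atomic compare-and-set in an external coordination service.

```python
import time


class LeaseStore:
    """In-memory stand-in for a coordination service holding a single lease."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._holder = None
        self._expires_at = 0.0

    def try_acquire(self, candidate, lease_seconds):
        """Grant or renew the lease if it is free, expired, or held by candidate."""
        now = self._clock()
        free = self._holder is None or now >= self._expires_at
        if free or self._holder == candidate:
            self._holder = candidate
            self._expires_at = now + lease_seconds
            return True
        return False

    def leader(self):
        """Return the current leader, or None if the lease has expired."""
        return self._holder if self._clock() < self._expires_at else None


if __name__ == "__main__":
    store = LeaseStore()
    print("orchestrator-a leads:", store.try_acquire("orchestrator-a", 5.0))
    print("orchestrator-b leads:", store.try_acquire("orchestrator-b", 5.0))
    print("current leader:", store.leader())
```

The active Orchestrator renews its lease periodically; if it crashes, the lease 
expires and a standby instance acquires it on its next attempt.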

Common messaging medium between microservices

This might be out of scope, but I thought of sharing it with the team to give 
a general idea. The idea was raised in the HipChat discussion with Marlon and 
Gourav. Using a common messaging medium might enable microservices to 
communicate in a decoupled manner, which would increase the scalability of the 
system. For example, there is a reference architecture that we can utilize with 
a Kafka-based messaging medium [1], [2]. However, I noticed in one paper that 
Kafka was previously rejected because writing clients was onerous. Please share 
your views on this, as I'm not familiar with the existing fan-out model based 
on AMQP and its pain points.
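
The decoupling that a common messaging medium gives can be sketched in Python 
with a tiny in-memory publish/subscribe broker. This is only an illustration 
under assumptions: InMemoryBroker, the topic name and the message fields are 
hypothetical, and it does not use the Kafka or AMQP client APIs.

```python
class InMemoryBroker:
    """Delivers each published message to every handler subscribed to its topic."""

    def __init__(self):
        self._subscribers = {}  # topic -> list of handler callables

    def subscribe(self, topic, handler):
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic, message):
        """Fan the message out and return how many handlers received it."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(message)
        return len(handlers)


if __name__ == "__main__":
    broker = InMemoryBroker()
    monitored, archived = [], []

    # Two independent consumers; the publisher does not know about either.
    broker.subscribe("job.status", monitored.append)
    broker.subscribe("job.status", archived.append)

    delivered = broker.publish("job.status", {"job_id": "job-1", "state": "COMPLETED"})
    print("delivered to", delivered, "subscribers")
```

The publisher only knows the topic name, so new consumers can be added or 
scaled out without changing the service that emits the events.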

Those are the main areas I have identified while going through the current 
Airavata implementation and the requirements stated in some of the research 
papers. Please let me know whether my understanding of the above items is 
correct; suggestions are always welcome :)

[1] 
https://medium.com/@ulymarins/an-introduction-to-apache-kafka-and-microservices-communication-bf0a0966d63
[2] 
https://www.slideshare.net/ConfluentInc/microservices-in-the-apache-kafka-ecosystem

References

Marru, S., Gunathilake, L., Herath, C., Tangchaisin, P., Pierce, M., Mattmann, 
C., Singh, R., Gunarathne, T., Chinthaka, E., Gardler, R. and Slominski, A., 
2011, November. Apache airavata: a framework for distributed applications and 
computational workflows. In Proceedings of the 2011 ACM workshop on Gateway 
computing environments (pp. 21-28). ACM.

Nakandala, S., Pamidighantam, S., Yodage, S., Doshi, N., Abeysinghe, E., 
Kankanamalage, C.P., Marru, S. and Pierce, M., 2016, July. Anatomy of the 
SEAGrid Science Gateway. In Proceedings of the XSEDE16 Conference on Diversity, 
Big Data, and Science at Scale (p. 40). ACM.

Pierce, Marlon E., Suresh Marru, Lahiru Gunathilake, Don Kushan Wijeratne, 
Raminder Singh, Chathuri Wimalasena, Shameera Ratnayaka, and Sudhakar 
Pamidighantam. "Apache Airavata: design and directions of a science gateway 
framework." Concurrency and Computation: Practice and Experience 27, no. 16 
(2015): 4282-4291.

Pierce, Marlon, Suresh Marru, Borries Demeler, Raminderjeet Singh, and Gary 
Gorbet. "The apache airavata application programming interface: overview and 
evaluation with the UltraScan science gateway." In Proceedings of the 9th 
Gateway Computing Environments Workshop, pp. 25-29. IEEE Press, 2014.

Marru, Suresh, Marlon Pierce, Sudhakar Pamidighantam, and Chathuri Wimalasena. 
"Apache Airavata as a laboratory: architecture and case study for 
component-based gateway middleware." In Proceedings of the 1st Workshop on The 
Science of Cyberinfrastructure: Research, Experience, Applications and Models, 
pp. 19-26. ACM, 2015.

Thanks
Dimuthu
