Hello Team,
As discussed in previous emails about the infrastructure we deployed
on AWS using Terraform, I am now trying to tackle some real-world
scenarios by extending our infrastructure to OpenStack (JetStream).
Here are some interesting observations and questions:
*1.* *Support for auto scaling in OpenStack:* We used an Auto Scaling
group to provide fault tolerance in the AWS environment. According to
the OpenStack docs, auto scaling can be achieved using Heat templates
and Ceilometer alarms. The problem is that Terraform does not support
Heat templates; the argument from the Terraform side is that Heat
templates and Terraform serve the same purpose, so there is no reason
to replicate the functionality.
*2.* *Is auto scaling the way to go for achieving fault tolerance?*
*3.* *Is there any other way to achieve auto scaling in OpenStack?*
*4.* *Kubernetes* (https://www.kubernetes.io/): Ditch all the
cloud-specific technologies and go for a Kubernetes cluster. Kubernetes
has built-in functionality for automating deployment, scaling, and
management of containerized applications.
Terraform is an awesome tool for provisioning and managing cloud
infrastructure, but as we depend heavily on JetStream for Airavata
deployment, we need to find answers to these questions. Any input is
highly appreciated.
Thanks and best regards,
*Anuj Bhandar*
MS Computer Science
Indiana University Bloomington
On 2/8/17 12:50 PM, Anuj Bhandar wrote:
Hello Dev,
Hope you are doing well!
As mentioned in the last mail, we thoroughly researched these load
balancing technology stacks:
1. Consul (service discovery) and Consul Template + HAProxy for load
balancing
2. Consul (service discovery) and Fabio for load balancing
Here are the observations we made:
* Consul + Fabio is an optimal stack for load balancing HTTP/S
traffic, but the API uses Apache Thrift for RPC interactions, so
Fabio would have to load balance plain TCP traffic, which becomes
messy (TProtocol traffic is not supported by any load balancer,
hence moving one layer below).
* We are setting up the other stack now and will soon post some test
graphs to give a good idea of performance.
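To illustrate what "moving one layer below" means, here is a minimal sketch of round-robin selection at the TCP level. The balancer only sees (host, port) pairs and never inspects the Thrift payload. The class name, the addresses and the port are hypothetical and only serve the illustration; this is not how Fabio or HAProxy are implemented.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Picks backends in rotation, skipping those marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)      # (host, port) pairs
        self.healthy = set(self.backends)   # all backends start healthy
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Try each backend at most once per call.
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backend available")


# Hypothetical api-gateway instances (addresses and port are assumptions).
a, b, c = ("10.0.0.11", 9930), ("10.0.0.12", 9930), ("10.0.0.13", 9930)
balancer = RoundRobinBalancer([a, b, c])
picks = [balancer.pick(), balancer.pick()]
balancer.mark_down(c)                        # simulate a failed health check
picks += [balancer.pick(), balancer.pick()]  # c is skipped from now on
```

Because the choice depends only on address and health state, the same logic works for any TCP protocol, Thrift included.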
The other criticism we received was that our Software Defined
Environment is not cloud agnostic: we depend heavily on AWS-specific
functionality to achieve our infrastructure setup. After some
research, we have come up with a solution to this problem. I agree
that we use an AWS Auto Scaling group to achieve fault tolerance, but
similar technologies are available on all major cloud platforms; what
we needed was a unified platform to manage all the clouds.
*Terraform* (https://www.terraform.io/): This is a cloud-agnostic
open source tool for building, changing, and versioning infrastructure
safely and efficiently. It *supports* all the major cloud platforms,
including *AWS*, *Google Cloud* and *OpenStack*, as well as *in-house
servers*. This makes managing the cloud infrastructure easy and clean.
I have created an AWS stack using Terraform and everything looks good;
I will upload the code and related wiki to GitHub soon. Meanwhile, it
really excites me that Terraform allows *multiple providers at
once*, i.e. we can provision AWS as well as OpenStack infrastructure
with a simple JSON file, all in one place. Moreover, Terraform
*maintains versioning of infrastructure code*, hence changing or
evolving the infrastructure to meet new demands is a simple task.
Please keep an eye on this GitHub Repository
(https://github.com/airavata-courses/spring17-API-Server) as new
content will be posted soon.
Thanks and best Regards,
*Anuj Bhandar*
MS Computer Science
Indiana University Bloomington
On 1/24/17 1:38 PM, Anuj Bhandar wrote:
Hello Dev,
Hope you are doing well !
As we all know, Airavata's popularity is growing as a middleware
provider for HPC clusters, and it is time to upgrade our architecture
to meet the demand. Described below is one area that needs attention,
followed by some plausible solutions.
The API gateway, which provides abstraction and security to several
underlying micro-services, is a single point of failure for accessing
the middleware functionality. This needs to be addressed by
introducing a load balancer and a fault-tolerant Software Defined
Environment (SDE). We are implementing some solutions and trying out
the popular stacks; below is a brief description:
*Environment (SDE)*: An AWS SDE with one Auto Scaling group
containing two spot instances on which the api-gateway is deployed.
Fault tolerance is handled inherently by the Auto Scaling feature,
i.e. in the event of a failure a new instance is spawned automatically
with all the data needed to start the server on startup. For more
information, see the detailed wiki:
https://github.com/airavata-courses/spring17-API-Server/wiki/Environment-(SDE)
Please note that this environment is meant for development and is not
production ready; more features will be added later.
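The replacement behaviour described above can be pictured as a reconciliation loop: compare the desired capacity with the healthy instances and launch whatever is missing. The following is a toy model under that assumption; the class, its fields and the instance ids are hypothetical and are not the AWS API.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ScalingGroup:
    """Toy model of an auto scaling group (not the AWS API)."""
    desired: int
    instances: dict = field(default_factory=dict)  # instance id -> healthy?
    _ids: count = field(default_factory=count, repr=False)

    def launch(self):
        instance_id = f"i-{next(self._ids)}"
        self.instances[instance_id] = True
        return instance_id

    def reconcile(self):
        """Drop unhealthy instances, then launch until `desired` is met."""
        for instance_id in [i for i, ok in self.instances.items() if not ok]:
            del self.instances[instance_id]
        launched = []
        while len(self.instances) < self.desired:
            launched.append(self.launch())
        return launched


group = ScalingGroup(desired=2)
group.reconcile()                 # initial launch of two instances
group.instances["i-0"] = False    # simulate a failed spot instance
replacements = group.reconcile()  # the failed instance is replaced
```

In the real environment the cloud provider runs this loop for us; the sketch only shows why no manual intervention is needed after a failure.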
*Load-balancing*: Now that the stage is set for deploying a load
balancer, below are some of the plausible combinations we think are
suitable for our scenario:
1. *Consul* (https://www.consul.io/) for service discovery and
*Consul Template + HAProxy* for load balancing.
2. *Consul + Fabio:* Fabio is an open source software router/load
balancer that interacts directly with Consul to load balance
services. It updates the services dynamically and doesn't require
a restart for configuration changes. In that sense it provides true
zero downtime.
3. *Serf + HAProxy:* Serf is the gossip-based membership tool that
Consul builds on: all servers that have Serf installed form a
mesh network in which each member is aware of every other
member. This is a highly available network with no masters or
slaves, only peers, so there is no single point of failure.
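As a rough picture of what the first option does, the sketch below renders an HAProxy backend section from a list of discovered instances, which is the job Consul Template would perform whenever the service catalog changes. The function name, node names, addresses and port are assumptions made for illustration; Consul Template itself uses its own template language rather than Python.

```python
def render_haproxy_backend(name, instances):
    """Render an HAProxy backend section in TCP mode.

    `instances` is a list of (node_name, address, port) tuples, the kind
    of data a service catalog query would return.
    """
    lines = [
        f"backend {name}",
        "    mode tcp",
        "    balance roundrobin",
    ]
    # Sort so the output is stable and reloads happen only on real changes.
    for node, address, port in sorted(instances):
        lines.append(f"    server {node} {address}:{port} check")
    return "\n".join(lines) + "\n"


# Hypothetical api-gateway instances as reported by service discovery.
discovered = [
    ("gateway-2", "10.0.0.12", 9930),
    ("gateway-1", "10.0.0.11", 9930),
]
config = render_haproxy_backend("api_gateway", discovered)
print(config)
```

Regenerating this section and reloading HAProxy on every catalog change is what keeps the load balancer in sync with instances that come and go.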
Your valuable feedback on the above-mentioned stacks is needed. We are
trying to set up the first two options to compare the results and will
keep you all updated on the progress.
Thanks and best regards,
*Anuj Bhandar*
MS Computer Science
Indiana University Bloomington