Hello Dev,
Hope you are doing well!
As mentioned in the last mail, we thoroughly researched these
load-balancing technology stacks:
1. Consul (service discovery) with Consul Template + HAProxy for load
   balancing
2. Consul (service discovery) with Fabio for load balancing
Here are the observations we made:
* Though Consul + Fabio is an optimal stack for load balancing HTTP/S
  traffic, our API uses Apache Thrift for RPC interactions, so Fabio
  would have to balance plain TCP traffic, which becomes messy (no
  load balancer understands Thrift's TProtocol, hence we must move one
  layer below and balance at the TCP level); a sketch of the
  corresponding Consul registration follows this list.
* We are setting up the other stack (Consul Template + HAProxy) now
  and will soon post some test graphs to give a good idea of its
  performance.
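For illustration, here is a minimal sketch of how the API server could
be registered in Consul for TCP-level balancing; the service name
"api-server" and port 8930 are placeholders, and the TCP check simply
verifies that the port accepts connections:

    {
      "service": {
        "name": "api-server",
        "tags": ["thrift", "tcp"],
        "port": 8930,
        "check": {
          "tcp": "localhost:8930",
          "interval": "10s"
        }
      }
    }

Consul Template + HAProxy can then render the healthy instances of this
service into a plain TCP backend, which is exactly the layer we need
for Thrift.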
The other criticism we received was that our Software Defined
Environment is not cloud-agnostic: we depend heavily on AWS-specific
functionality for our infrastructure setup. After some research, we
have come up with a solution to this problem. It is true that we use an
AWS Auto Scaling group to achieve fault tolerance, but similar
technologies are available on all major cloud platforms; what we needed
was a unified platform to manage all the clouds.
*Terraform* (https://www.terraform.io/): This is a cloud-agnostic open
source tool for building, changing, and versioning infrastructure
safely and efficiently. It *supports* all the major cloud platforms,
including *AWS*, *Google Cloud* and *OpenStack*, and even *in-house
servers*. This makes managing cloud infrastructure easy and clean. I
have created an AWS stack using Terraform and everything looks good; I
will upload the code and a related wiki to GitHub soon. Meanwhile, what
really excites me is that Terraform allows *multiple providers at
once*, i.e. we can provision AWS as well as OpenStack infrastructure
from one simple configuration file (HCL or plain JSON), all in one
place; a sketch is given below. Moreover, because Terraform expresses
the infrastructure as *versionable code*, changing or evolving the
infrastructure to meet new demands is a simple task.
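As a taste of the multi-provider support, here is a minimal,
hypothetical sketch in Terraform's native HCL (a pure-JSON equivalent
is also accepted); the region, auth URL, AMI, image and flavor names
below are placeholders, not our actual configuration:

    # Two providers in one configuration: AWS and OpenStack.
    provider "aws" {
      region = "us-east-1"
    }

    provider "openstack" {
      auth_url = "https://openstack.example.org:5000/v2.0"
    }

    # One api-gateway instance on each cloud, side by side.
    resource "aws_instance" "api_gateway_aws" {
      ami           = "ami-00000000"   # placeholder AMI
      instance_type = "t2.micro"
    }

    resource "openstack_compute_instance_v2" "api_gateway_os" {
      name        = "api-gateway"
      image_name  = "ubuntu-16.04"     # placeholder image
      flavor_name = "m1.small"
    }

A single "terraform apply" then provisions both clouds at once.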
Please keep an eye on this GitHub repository
(https://github.com/airavata-courses/spring17-API-Server), as new
content will be posted soon.
Thanks and best Regards,
*Anuj Bhandar*
MS Computer Science
Indiana University Bloomington
On 1/24/17 1:38 PM, Anuj Bhandar wrote:
Hello Dev,
Hope you are doing well!
As we all know, Airavata's popularity as a middleware provider for HPC
clusters is growing, and it is time to upgrade our architecture to meet
the demand. Described below is one such area which needs attention,
followed by some plausible solutions.
The API gateway, which provides abstraction and security for several
underlying microservices, is a single point of failure for accessing
the middleware functionality. This needs to be addressed by introducing
a load balancer and a fault-tolerant Software Defined Environment
(SDE). We are implementing some solutions and trying out the popular
stacks; below is a brief description of the same:
*Environment (SDE)*: An AWS SDE with one Auto Scaling group containing
two spot instances to deploy the api-gateway. Fault tolerance is
handled inherently by the Auto Scaling feature, i.e. in the event of a
failure a new instance is spawned automatically with all the data
needed to start the server on boot. For more info on this, a detailed
wiki has been written:
https://github.com/airavata-courses/spring17-API-Server/wiki/Environment-(SDE).
A rough sketch of the setup follows below. Please note that this
environment is meant for development and is not production-ready; more
features will be added later.
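To give an idea of the shape of this setup, here is a minimal sketch
using the AWS CLI; the names, AMI ID, bid price and availability zones
are placeholders, not our actual values (those live in the wiki above):

    # Launch configuration: spot instances bootstrapped via user-data.
    aws autoscaling create-launch-configuration \
        --launch-configuration-name api-gateway-lc \
        --image-id ami-00000000 \
        --instance-type t2.micro \
        --spot-price "0.01" \
        --user-data file://start-api-server.sh

    # Auto Scaling group: keep two instances alive at all times.
    aws autoscaling create-auto-scaling-group \
        --auto-scaling-group-name api-gateway-asg \
        --launch-configuration-name api-gateway-lc \
        --min-size 2 --max-size 2 \
        --availability-zones us-east-1a us-east-1b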
*Load balancing*: Now that the stage is set for deploying a load
balancer, below are some of the plausible combinations we think can be
suitable for our scenario (a sample configuration for option 1 follows
the list):
1. *Consul* (https://www.consul.io/) for service discovery and
*Consul Template + HAproxy* for load balancing.
2. *Consul + Fabio*: Fabio is an open source router/load balancer
   that interacts directly with Consul to load balance services. It
   picks up service changes dynamically and doesn't require a restart
   for configuration changes; in that sense it provides true zero
   downtime.
3. *Serf + HAProxy*: Serf provides the gossip layer at the core of
   Consul: all servers that have Serf installed form a mesh network,
   and each member is aware of every other member. This is a highly
   available network with no masters or slaves, only peers, so there
   is no single point of failure.
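To make option 1 concrete, here is a minimal, hypothetical fragment of
an HAProxy configuration template that Consul Template would keep in
sync with the Consul catalog; the service name "api-server" is a
placeholder for however we register the gateway:

    # haproxy.cfg.ctmpl -- Consul Template re-renders this file and
    # reloads HAProxy whenever the set of healthy instances changes.
    backend api_gateway
        mode tcp
        balance roundrobin{{ range service "api-server" }}
        server {{ .Node }} {{ .Address }}:{{ .Port }} check{{ end }}

Because the backend runs in plain TCP mode, the same setup also covers
non-HTTP traffic.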
Your valuable feedback on the above-mentioned stacks is needed. We are
setting up the first two options to compare the results and will keep
you all updated on the progress.
Thanks and best regards,
*Anuj Bhandar*
MS Computer Science
Indiana University Bloomington