Hello Dev,

Hope you are doing well!

As mentioned in the last mail, we have thoroughly researched the following load-balancing technology stacks:

1. Consul (Service Discovery) with Consul Template + HAProxy for load
   balancing
2. Consul (Service Discovery) with Fabio for load balancing

Here are the observations we have made so far:

 * Though Consul + Fabio is an optimal stack for load balancing HTTP/S
   traffic, the API uses Apache Thrift for its RPC interactions, and it
   becomes messy to use Fabio to load balance plain TCP traffic
   (TProtocol traffic is not understood by any load balancer, hence we
   move one layer below and balance raw TCP; a sketch follows this
   list).
 * We are setting up the other stack (Consul Template + HAProxy) now
   and will soon post some test graphs to give a good idea of its
   performance.
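
To make "one layer below" concrete, here is a minimal sketch of a Consul Template input that renders an HAProxy section balancing the Thrift port in plain TCP mode. The service name "api-gateway" and port 9930 are hypothetical placeholders, not our actual configuration:

    # api-gateway.ctmpl -- consul-template renders this into haproxy.cfg
    # TCP mode: HAProxy forwards raw bytes and never parses TProtocol.
    listen api-gateway
        bind *:9930
        mode tcp
        balance roundrobin{{ range service "api-gateway" }}
        server {{ .Node }} {{ .Address }}:{{ .Port }} check{{ end }}

consul-template watches Consul and re-renders this file (then reloads HAProxy) whenever gateway instances are added or removed.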

The other criticism we received was that our Software Defined Environment is not cloud agnostic: we depend heavily on AWS-specific functionality for our infrastructure setup. After some research, we have come up with a solution to this problem. It is true that we use an AWS Auto Scaling group to achieve fault tolerance, but similar technologies are available on all the major cloud platforms; what we needed was a unified platform to manage all the clouds.

*Terraform* (https://www.terraform.io/) is a cloud-agnostic open source tool for building, changing, and versioning infrastructure safely and efficiently. It *supports* all the major cloud platforms, including *AWS*, *Google Cloud*, and *OpenStack*, as well as *in-house servers*, which makes managing cloud infrastructure easy and clean. I have created an AWS stack using Terraform and everything looks good; I will upload the code and a related wiki to GitHub soon. Meanwhile, what really excites me is that Terraform allows *multiple providers at once*, i.e., we can provision AWS as well as OpenStack infrastructure from one simple configuration (plain JSON files work too), all in one place. Moreover, Terraform *keeps the infrastructure definition as versionable code*, so changing or evolving the infrastructure to meet new demands is a simple task.
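
To make the multiple-providers point concrete, below is a minimal sketch of a Terraform configuration that declares two providers side by side (the same structure can also be written as plain JSON). All the values here (region, endpoint, AMI, image, flavor) are placeholders, not our actual setup:

    # Two cloud providers managed from one configuration.
    provider "aws" {
      region = "us-east-1"
    }

    provider "openstack" {
      # Placeholder endpoint; credentials omitted for brevity.
      auth_url = "https://openstack.example.org:5000/v2.0"
    }

    # One gateway instance on each cloud.
    resource "aws_instance" "api_gateway_aws" {
      ami           = "ami-0b33d91d"  # placeholder AMI
      instance_type = "t2.micro"
    }

    resource "openstack_compute_instance_v2" "api_gateway_os" {
      name        = "api-gateway"
      image_name  = "ubuntu-16.04"    # placeholder image
      flavor_name = "m1.small"
    }

A single "terraform apply" then provisions both clouds in one run.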

Please keep an eye on this GitHub repository (https://github.com/airavata-courses/spring17-API-Server), as new content will be posted there soon.


Thanks and best regards,

*Anuj Bhandar*
MS Computer Science
Indiana University Bloomington


On 1/24/17 1:38 PM, Anuj Bhandar wrote:

Hello Dev,

Hope you are doing well!

As we all know, Airavata's popularity as a middleware provider for HPC clusters is growing, and it is time to upgrade our architecture to meet the demand. Described below is one such area that needs attention, followed by some plausible solutions.

The API gateway, which provides abstraction and security for several underlying micro-services, is a single point of failure for accessing the middleware functionality. This needs to be addressed by introducing a load balancer and a fault-tolerant Software Defined Environment (SDE). We are implementing some solutions and trying out the popular stacks; below is a brief description of each:

*Environment (SDE)*: An AWS SDE with one Auto Scaling group containing two spot instances to deploy the api-gateway. Fault tolerance is handled inherently by the Auto Scaling feature, i.e., in the event of a failure a new instance is spawned automatically with all the data needed to start the server on boot. For more on this, a detailed wiki is available: https://github.com/airavata-courses/spring17-API-Server/wiki/Environment-(SDE). Please note that this environment is meant for development and is not production ready; more features will be added later. A rough sketch of this setup with the AWS CLI follows.
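
For a concrete picture, here is that sketch; the names, AMI ID, spot price, and subnet below are placeholders, not our actual values:

    # Launch configuration: spot instances plus user-data that
    # bootstraps the api-gateway on startup.
    aws autoscaling create-launch-configuration \
        --launch-configuration-name api-gateway-lc \
        --image-id ami-0b33d91d \
        --instance-type t2.micro \
        --spot-price "0.01" \
        --user-data file://start-gateway.sh

    # Auto Scaling group pinned at two instances; a failed instance is
    # replaced automatically, which is where the fault tolerance comes from.
    aws autoscaling create-auto-scaling-group \
        --auto-scaling-group-name api-gateway-asg \
        --launch-configuration-name api-gateway-lc \
        --min-size 2 --max-size 2 --desired-capacity 2 \
        --vpc-zone-identifier subnet-0abc1234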

*Load balancing*: Now that the stage is set for deploying a load balancer, below are some plausible combinations we think could suit our scenario (a sample Consul service registration, which all the options build on, follows the list):

 1. *Consul* (https://www.consul.io/) for service discovery and
    *Consul Template + HAProxy* for load balancing.
 2. *Consul + Fabio:* Fabio is an open source router/load balancer
    that interacts directly with Consul to load balance services. It
    picks up service changes dynamically and does not require a
    restart for configuration changes; in that sense it provides true
    zero downtime.
 3. *Serf + HAProxy:* Serf provides the gossip and membership layer at
    the core of Consul: all servers that have Serf installed form a
    mesh network, and each member is aware of every other member. This
    is a highly available network with no masters or slaves, only
    peers, so there is no single point of failure.
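
All three options start from the same building block: each gateway instance registers itself with its local Consul agent. A minimal sketch of such a service definition is below; the name, port, and tag are hypothetical placeholders:

    {
      "service": {
        "name": "api-gateway",
        "port": 9930,
        "tags": ["urlprefix-/api"],
        "check": {
          "tcp": "localhost:9930",
          "interval": "10s"
        }
      }
    }

Consul runs the TCP health check and hands only passing instances to the load balancer; the "urlprefix-" tag is the convention Fabio (option 2) uses to discover routes.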

Your valuable feedback on the above-mentioned stacks is needed. We are setting up the first two options to compare the results and will keep you all updated on the progress.

Thanks and best regards,

*Anuj Bhandar*
MS Computer Science
Indiana University Bloomington

