Re: My Scalable Architecture using HAProxy

KT Walrus Wed, 02 Jan 2013 23:56:54 -0800

Thanks for the reply.

> Why installing 2 layers of HAProxy???
> A single one (on the 2 servers is enough).

My thought was that the second layer of HAProxy would ensure that an 
individual backend server never has more than MAXCONN concurrent requests.  
That way I know the server will never be overloaded, which could otherwise 
lead to the server going down or taking too long to process a request.

I want multiple active frontend LBs so that my architecture will scale 
infinitely to many more frontends if necessary.  If I eventually needed more 
than 6 servers, I would set up another 6 servers (using the same setup at 2 
data centers for additional HA).

> Since you're doing SSL, try
> to make it start multiple processes, a single one dedicated to HTTP
> and all other one for cyphering/deciphering processing…

Yes.  I planned on doing that.  My 2 frontend servers are UP (4 cores) while 
the 4 backend servers can be upgraded to DP (16 cores) and huge RAM (256 GB).  
I've already purchased these servers.  I expect that 1 frontend server would be 
sufficient for a long time, but I want HA by having the two frontends on 
separate independent power/ethernet connections within the datacenter.

> I'm not a fan of first algo, unless you pay the resource per number of
> backend server, which is not your case.

I just thought "first" load balancing was perfect for "guarding" that an 
individual backend server never exceeds MAXCONN concurrent requests.  The 
overhead should be minimal since this "guard" HAProxy will almost always pass 
the request to the localhost nginx/varnish.  I need this "guard" because there 
are multiple frontend LBs doing simple round robin to the backends 
independently, so no single LB knows the total load on a backend.  Overload 
becomes more of a possibility when and if I need more LBs independently 
distributing requests to the backends.
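
To make the "guard" idea concrete, here is a minimal Python sketch of the 
selection rule I have in mind: take the first server that still has a free 
slot, and fall through to the next one when it is full.  The server names and 
MAXCONN values are made up for illustration, and this only models the rule, 
not HAProxy itself.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    maxconn: int      # hypothetical per-server connection limit
    active: int = 0   # connections currently in flight


def pick_first(servers):
    """Return the first server with a free slot, or None if all are full."""
    for server in servers:
        if server.active < server.maxconn:
            server.active += 1
            return server
    return None


def release(server):
    """Mark one connection on `server` as finished."""
    if server.active > 0:
        server.active -= 1


# The guard on backend 1: local nginx/varnish first, then the next guard.
chain = [Server("localhost", maxconn=2), Server("backend2", maxconn=2)]
placed = [pick_first(chain) for _ in range(5)]
names = [s.name if s else "sorry-page" for s in placed]
```

With both servers full, the fifth request finds no free slot, which is where 
the "servers are all busy" page would come in.  As soon as a local connection 
is released, the next request goes back to localhost.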

> Prefer using a hash in your case (even multiple hash with different
> backends and content switching), that way, your hit rate would be much
> better.

I'm not so concerned about individual hit rate as I am about HA and infinite 
scalability.  It is relatively cheap to add a new server to handle more backend 
or frontend load, or to place some servers in a new datacenter.  I'd rather 
have my servers run at 50% capacity (purchasing twice the hardware) if that 
means increased HA from having the guard HAProxy instances and never pushing 
the servers so hard that individual pieces of the software/hardware stack 
start to fail.
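
As a rough check on the 50% figure, here is a small Python sketch of how much 
load each surviving server carries when some servers fail.  It assumes the 
total load stays the same and is spread evenly over the survivors, which is a 
simplification (the function name and the numbers are only for illustration).

```python
def utilization_after_failures(servers, utilization, failed):
    """Per-server utilization once `failed` servers drop out and the
    same total load is spread evenly over the survivors."""
    survivors = servers - failed
    if survivors <= 0:
        raise ValueError("no servers left to carry the load")
    return servers * utilization / survivors


# 4 backends running at 50% each.
one_down = utilization_after_failures(4, 0.50, 1)   # 4 * 0.5 / 3
two_down = utilization_after_failures(4, 0.50, 2)   # 4 * 0.5 / 2
```

Under these assumptions, losing one of the 4 backends puts the other three at 
about 67%, and losing two puts the remaining pair at 100%.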

> no need to host a sorry page on a far away server, host it on your
> frontend LBs and HAProxy can deliver it once your server farm is
> full…

That is true.  I was really thinking that the first Amazon "overflow" server 
might eventually be set up as a full backend server.  If the sorry page ever 
starts to be served by Amazon, I would simply create one or more EC2 servers 
to take the temporary load.  I actually plan on implementing the website as 
EC2 instances (using this architecture) until my Amazon bill goes over $500 a 
month, at which time I would go colo.

> An other remark, it may be hard to troubleshoot such infra with 2
> Active/active LBs.

I think I have to deal with this, but since each LB is handling unique VIPs 
(unless keepalived kicks in due to failure), I don't think there is going to be 
that much trouble.
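
Here is a minimal Python sketch of the VIP ownership I am picturing: each VIP 
has a preferred LB and moves to the other LB only when its owner is down.  The 
names (lb1, lb2, vip-a, vip-b) are hypothetical, and this is only a model of 
the intended behavior, not a keepalived configuration.

```python
def vip_owners(preferred, healthy):
    """Map each VIP to the LB that should currently hold it.

    preferred: dict of VIP -> ordered list of LB names (usual owner first)
    healthy:   set of LB names that are currently up
    """
    owners = {}
    for vip, candidates in preferred.items():
        # First healthy candidate wins; None means nobody can serve the VIP.
        owners[vip] = next((lb for lb in candidates if lb in healthy), None)
    return owners


preferred = {"vip-a": ["lb1", "lb2"], "vip-b": ["lb2", "lb1"]}
normal = vip_owners(preferred, {"lb1", "lb2"})
lb1_down = vip_owners(preferred, {"lb2"})
```

In normal operation each LB answers only for its own VIP, so a request can be 
traced to one LB.  Only after a failure does one LB hold both.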

> And using DNS rr does not prevent you from using keepalived to ensure
> HA between your 2 HAProxys.

Yes.  I am hoping this is the case.  I eventually want at least two geographic 
locations (east and west coast data centers) so 4 IPs in the DNS to distribute 
to the closest datacenter.  I use DNSMadeEasy which can support both DNS Global 
Traffic Director (east coast and west coast IP Anycast) and DNS Failover 
(in case one datacenter goes offline).

> 
> cheers
> 
> 
> On Thu, Jan 3, 2013 at 12:20 AM, KT Walrus <[email protected]> wrote:
>> I'm setting up a new website in the next month or two.  Even though the 
>> traffic won't require a scalable HA website, I'm going to start out as if 
>> the website needs to support huge traffic so I can get some experience 
>> running such a website.
>> 
>> I'd like any feedback on what I am thinking of doing…
>> 
>> As for hardware, I am colocating 6 servers at this time and plan to use 
>> Amazon S3 to host the static files (which should grow quickly to 1TB or 2TB 
>> of mostly images).  2 of the servers are going to be my frontend load 
>> balancers running haproxy.  The remaining 4 servers will be nginx/varnish 
>> servers (nginx for the PHP/MySQL part of the site and varnish to cache the 
>> Amazon S3 files to save bandwidth charges by Amazon).
>> 
>> I plan on doing DNS load balancing using pairs of A records for each hosted 
>> domain that will point to each of my frontend haproxy load balancers.  Most 
>> traffic will be HTTPS, so I plan on having the frontend load balancers 
>> handle the SSL (using the new haproxy support for SSL).
>> 
>> The two load balancers will proxy to the 4 backend servers.  These 4 backend 
>> servers will run haproxy in front of nginx/varnish with load balancing of 
>> "first" and a suitable MAXCONN.  Server 1 haproxy will first route to the 
>> localhost nginx/varnish and when MAXCONN connections are active to the 
>> localhost, will forward the connection to Server 2 haproxy.  Server 2 and 3 
>> will be set up similarly to first route requests to localhost and when full, 
>> route subsequent requests to the next server.  Server 4 will route excess 
>> requests to a small Amazon EC2 instance to return a "servers are all busy" 
>> page.  Hopefully, I will be able to add a 5th backend server at Amazon to 
>> handle the overload if it looks like I really do have traffic that will fill 
>> all 4 backend servers that I am colo'ing (I don't really expect this to ever 
>> be necessary).
>> 
>> Nginx will proxy to PHP on localhost and each localhost (of my 4 backend 
>> servers) will have 2 MySQL instances - one for the main Read-Only DB and one 
>> for a Read-Write SessionDB.  PHP will go directly to the main DB (not 
>> through HAProxy) and will use HAProxy to select the proper SessionDB to use 
>> (each user session must use the same SessionDB so the one a request needs 
>> might be on any of the backend servers).  Each SessionDB will be the master 
>> of one slave SessionDB on a different backend server for handling the 
>> failure of the master (haproxy will send requests to the slave SessionDB if 
>> the master is down or failing).
>> 
>> So, each backend server will have haproxy to "first" balance HTTP to 
>> nginx/varnish.  The backends also have PHP and 3 instances of MySQL (one for 
>> mainDB, one for master sessionDB, and one for another backend's slave 
>> sessionDB).
>> 
>> Also, the 2 frontend servers will be running separate instances of haproxy.  
>> I hope to use keepalived to route the VIPs for one frontend to the other 
>> frontend in case of failure.  Or, should I use heartbeat?  There seem to be 
>> two HA solutions here.
>> 
>> I know this is a very long description of what I am thinking of doing and I 
>> thank you if you have read this far.  I'm looking for any comments on this 
>> setup.  Any comments on using "first" load balancing/MAXCONN on 
>> the backend servers so that a request load balanced from the frontend will 
>> keep the backend servers from overloading (possibly bouncing a request from 
>> server 1 to server 2 to server 3 to server 4 to EC2 "server busy" server) 
>> are especially appreciated.  Also, any comments on using pairs of 
>> master/slave sessionDBs to provide high availability but still have session 
>> data saved/retrieved for a given user from the same DB are appreciated.  I 
>> believe this setup will allow the load to be distributed evenly over the 4 
>> backends and only have the front end load balancers do simple round robin 
>> without session stickiness.
>> 
>> Kevin
>> 
>>
