Re: My Scalable Architecture using HAProxy

Baptiste Thu, 03 Jan 2013 00:45:45 -0800

On Thu, Jan 3, 2013 at 8:56 AM, KT Walrus <[email protected]> wrote:
> Thanks for the reply.
>
>> Why install 2 layers of HAProxy?
>> A single one (on the 2 servers) is enough.
>
> My thought was that the second layer of HAProxy would ensure that the 
> individual backend server would never have more than MAXCONN requests so I 
> know the server will never be overloaded possibly leading to the server going 
> down or taking too long to process a request.
>
See below.


>> I'm not a fan of the first algo, unless you pay for resources per
>> backend server, which is not your case.
>
> I just thought "first" load balancing was perfect for "guarding" that an 
> individual backend server never exceeded MAXCONN concurrent requests.  The 
> overhead should be minimal since this "guard" HAProxy almost always will pass 
> the request to localhost nginx/varnish.  I need this "guard" because there 
> are multiple frontend LBs doing simple round robin to the backends 
> independently.  This might become more of a possibility when and if I need 
> more LBs independently distributing requests to the backends.
>

This is the role of the first layer of load balancers.
The maxconn feature makes it smart: as soon as one of your backend
servers reaches its maxconn, it is pulled out of the LB algorithm
until its number of connections decreases.
I definitely would not use 2 layers of HAProxy, and I would not use
the first algo either...

Note that an active/active HAProxy setup can be designed (with some
configuration tricks) to adapt its maxconn to the number of LBs
available.
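
A minimal sketch of the single-layer setup described above, to make it
concrete. The names (ft_web, bk_web, web1..web4), the addresses, the
maxconn value and the timeouts are all assumptions for illustration, not
values taken from this thread; size maxconn to what one nginx/varnish
box can really handle.

```
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s
    # how long a request may wait in the queue when every server is full
    timeout queue   10s

frontend ft_web
    bind 192.0.2.10:80
    default_backend bk_web

backend bk_web
    balance roundrobin
    # per-server maxconn: HAProxy never opens more than 100 concurrent
    # connections to a server; a full server is skipped until a slot frees up
    server web1 10.0.0.11:80 check maxconn 100
    server web2 10.0.0.12:80 check maxconn 100
    server web3 10.0.0.13:80 check maxconn 100
    server web4 10.0.0.14:80 check maxconn 100
```

With 2 independent LBs each server can receive up to 2 x maxconn
connections in the worst case, so in this sketch the per-LB value would
have to be about half of what a server can take.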


>> Prefer using a hash in your case (even multiple hashes with different
>> backends and content switching); that way, your hit rate would be much
>> better.
>
> I'm not so concerned about individual hit rate as I am about HA and infinite 
> scalability.  It is relatively cheap to add a new server to handle more 
> backend or frontend load or split to placing some servers in a new 
> datacenter.  I'd rather have my servers run at 50% capacity (purchasing twice 
> the hardware) if that means increased HA from having the guard HAProxy's and 
> never coming close to pushing them too hard that individual pieces of the 
> software/hardware stack start to fail.
>

This is not a question of cost; it is a question of efficiency and
response time (on the client side).
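
To illustrate the hash suggestion quoted above: hashing on the URI sends
a given URL to the same varnish every time, so each cache holds its own
subset of objects instead of all 4 caches fetching the same files from
S3. This is only a sketch; the path prefix, backend names, ports and
addresses are assumptions.

```
frontend ft_web
    bind 192.0.2.10:80
    # content switching: cacheable static content goes to the hashed backend
    acl is_static path_beg /images/
    use_backend bk_cache if is_static
    default_backend bk_app

backend bk_cache
    # same URI -> same varnish server, which improves the cache hit rate
    balance uri
    # consistent hashing limits remapping when a server is added or removed
    hash-type consistent
    server cache1 10.0.0.11:6081 check maxconn 100
    server cache2 10.0.0.12:6081 check maxconn 100

backend bk_app
    balance roundrobin
    server app1 10.0.0.11:80 check maxconn 100
    server app2 10.0.0.12:80 check maxconn 100
```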

>> Another remark: it may be hard to troubleshoot such an infrastructure
>> with 2 active/active LBs.
>
> I think I have to deal with this, but since each LB is handling unique VIPs 
> (unless keepalived kicks in due to failure), I don't think there is going to 
> be that much trouble.

No: you said your domain name would be published as 2 DNS A records,
so a client could arrive on either LB at any time (modulo the DNS TTL).

>> And using DNS rr does not prevent you from using keepalived to ensure
>> HA between your 2 HAProxys.
>
> Yes.  I am hoping this is the case.  I eventually want at least two 
> geographic locations (east and west coast data centers) so 4 IPs in the DNS 
> to distribute to the closest datacenter.  I use DNSMadeEasy which can support 
> both DNS Global Traffic Director (east coast and west coast IP Anycast) and 
> DNS Failover (in case one datacenter goes offline).
>

That is good.
To avoid losing traffic, HAProxy could also be used to forward
traffic to the other DC if all the servers in its local DC are
unavailable.
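
One way to sketch this cross-DC fallback is with a "backup" server,
which HAProxy only uses once every non-backup server in the backend is
down. The addresses and names below are assumptions; other_dc stands for
whatever VIP the other datacenter exposes.

```
backend bk_web
    balance roundrobin
    # local servers, used as long as at least one passes its health check
    server web1 10.0.0.11:80 check maxconn 100
    server web2 10.0.0.12:80 check maxconn 100
    # hypothetical VIP of the other DC, used only when all local servers are down
    server other_dc 198.51.100.10:80 check backup
```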

>>
>> cheers
>>
>>
>> On Thu, Jan 3, 2013 at 12:20 AM, KT Walrus <[email protected]> wrote:
>>> I'm setting up a new website in the next month or two.  Even though the 
>>> traffic won't require a scalable HA website, I'm going to start out as if 
>>> the website needs to support huge traffic so I can get some experience 
>>> running such a website.
>>>
>>> I'd like any feedback on what I am thinking of doing…
>>>
>>> As for hardware, I am colocating 6 servers at this time and plan to use 
>>> Amazon S3 to host the static files (which should grow quickly to 1TB or 2TB 
>>> of mostly images).  2 of the servers are going to be my frontend load 
>>> balancers running haproxy.  The remaining 4 servers will be nginx/varnish 
>>> servers (nginx for the PHP/MySQL part of the site and varnish to cache the 
>>> Amazon S3 files to save bandwidth charges by Amazon).
>>>
>>> I plan on doing DNS load balancing using pairs of A records for each hosted 
>>> domain that will point to each of my frontend haproxy load balancers.  Most 
>>> traffic will be HTTPS, so I plan on having the frontend load balancers to 
>>> handle the SSL (using the new haproxy support for SSL).
>>>
>>> The two load balancers will proxy to the 4 backend servers.  These 4 
>>> backend servers will run haproxy in front of nginx/varnish with load 
>>> balancing of "first" and a suitable MAXCONN.  Server 1 haproxy will first 
>>> route to the localhost nginx/varnish and when MAXCONN connections are 
>>> active to the localhost, will forward the connection to Server 2 haproxy.  
>>> Server 2 and 3 will be set up similarly to first route requests to 
>>> localhost and when full, route subsequent requests to the next server.  
>>> Server 4 will route excess requests to a small Amazon EC2 instance to 
>>> return a "servers are all busy" page.  Hopefully, I will be able to add a 
>>> 5th backend server at Amazon to handle the overload if it looks like I 
>>> really do have traffic that will fill all 4 backend servers that I am 
>>> colo'ing (I don't really expect this to ever be necessary).
>>>
>>> Nginx will proxy to PHP on localhost and each localhost (of my 4 backend 
>>> servers) will have 2 MySQL instances - one for the main Read-Only DB and 
>>> one for a Read-Write SessionDB.  PHP will go directly to the main DB (not 
>>> through HAProxy) and will use HAProxy to select the proper SessionDB to use 
>>> (each user session must use the same SessionDB so the one a request needs 
>>> might be on any of the backend servers).  Each SessionDB will be the master 
>>> of one slave SessionDB on a different backend server for handling the 
>>> failure of the master (haproxy will send requests to the slave SessionDB if 
>>> the master is down or  failing).
>>>
>>> So, each backend server will have haproxy to "first" balance HTTP to 
>>> nginx/varnish.  The backends also have PHP and 3 instances of MySQL (one 
>>> for mainDB, one for master sessionDB, and one for another backend's slave 
>>> sessionDB).
>>>
>>> Also, the 2 frontend servers will be running separate instances of haproxy. 
>>>  I hope to use keepalived to route the VIPs for one frontend to the other 
>>> frontend in case of failure.  Or, should I use heartbeat?  There seems to 
>>> be two HA solutions here.
>>>
>>> I know this is a very long description of what I am thinking of doing and I 
>>> thank you if you have read this far.  I'm looking for any comments on this 
>>> setup.  Especially, any comments on using "first" load balancing/MAXCONN on 
>>> the backend servers so that a request load balanced from the frontend will 
>>> keep the backend servers from overloading (possibly bouncing a request from 
>>> server 1 to server 2 to server 3 to server 4 to EC2 "server busy" server) 
>>> are especially appreciated.  Also, any comments on using pairs of 
>>> master/slave sessionDBs to provide high availability but still have session 
>>> data saved/retrieved for a given user from the same DB are appreciated.  I 
>>> believe this setup will allow the load to be distributed evenly over the 4 
>>> backends and only have the front end load balancers do simple round robin 
>>> without session stickiness.
>>>
>>> Kevin
>>>
>>>
>
