On Thu, Jan 3, 2013 at 8:56 AM, KT Walrus <[email protected]> wrote:
> Thanks for the reply.
>
>> Why installing 2 layers of HAProxy???
>> A single one (on the 2 servers is enough).
>
> My thought was that the second layer of HAProxy would ensure that the
> individual backend server would never have more than MAXCONN requests so I
> know the server will never be overloaded possibly leading to the server going
> down or taking too long to process a request.
>

See below.

>> I'm not a fan of first algo, unless you pay the resource per number of
>> backend server, which is not your case.
>
> I just thought "first" load balancing was perfect for "guarding" that an
> individual backend server never exceeded MAXCONN concurrent requests. The
> overhead should be minimal since this "guard" HAProxy almost always will pass
> the request to localhost nginx/varnish. I need this "guard" because there
> are multiple frontend LBs doing simple round robin to the backends
> independently. This might become more of a possibility when and if I need
> more LBs independently distributing requests to the backends.
>

This is the role of the first layer of load balancers.
The maxconn feature makes it smart: as soon as one of your backend
servers reaches its maxconn, it is pulled out of the LB algorithm until
its number of connections decreases.
I definitely would not use 2 layers of HAProxy, and I would not use the
first algo either.
Note that an active/active HAProxy setup can be designed (with a tricky
configuration) to adapt its maxconn based on the number of LBs available.

>> Prefer using a hash in your case (even multiple hash with different
>> backends and content switching), that way, your hit rate would be much
>> better.
>
> I'm not so concerned about individual hit rate as I am about HA and infinite
> scalability. It is relatively cheap to add a new server to handle more
> backend or frontend load or split to placing some servers in a new
> datacenter. I'd rather have my servers run at 50% capacity (purchasing twice
> the hardware) if that means increased HA from having the guard HAProxy's and
> never coming close to pushing them too hard that individual pieces of the
> software/hardware stack start to fail.
>

This is not a question of cost; it is a question of efficiency and
response time (on the client side).

>> An other remark, it may be hard to troubleshoot such infra with 2
>> Active/active LBs.
>
> I think I have to deal with this, but since each LB is handling unique VIPs
> (unless keepalived kicks in due to failure), I don't think there is going to
> be that much trouble.

No: you said your domain name would be published as 2 DNS A records, so
a client could arrive on either LB at any time (modulo the DNS TTL).

>> And using DNS rr does not prevent you from using keepalived to ensure
>> HA between your 2 HAProxys.
>
> Yes. I am hoping this is the case. I eventually want at least two
> geographic locations (east and west coast data centers) so 4 IPs in the DNS
> to distribute to the closest datacenter. I use DNSMadeEasy which can support
> both DNS Global Traffic Director (east coast and west coast IP Anycast) and
> DNS Failover (incase one datacenter goes offline).
>

That is good.
To avoid losing traffic, HAProxy could also be used to forward traffic
to the other DC if all the servers in its local DC are unavailable.

>>
>> cheers
>>
>>
>> On Thu, Jan 3, 2013 at 12:20 AM, KT Walrus <[email protected]> wrote:
>>> I'm setting up a new website in the next month or two. Even though the
>>> traffic won't require a scalable HA website, I'm going to start out as if
>>> the website needs to support huge traffic so I can get some experience
>>> running such a website.
>>>
>>> I'd like any feedback on what I am thinking of doing…
>>>
>>> As for hardware, I am colocating 6 servers at this time and plan to use
>>> Amazon S3 to host the static files (which should grow quickly to 1TB or 2TB
>>> of mostly images). 2 of the servers are going to be my frontend load
>>> balancers running haproxy. The remaining 4 servers with be nginx/varnish
>>> servers (nginx for the PHP/MySQL part of the site and varnish to cache the
>>> Amazon S3 files to save bandwidth charges by Amazon).
>>>
>>> I plan on doing DNS load balancing using pairs of A records for each hosted
>>> domain that will point to each of my frontend haproxy load balancers.
>>> Most traffic will be HTTPS, so I plan on having the frontend load
>>> balancers to handle the SSL (using the new haproxy support for SSL).
>>>
>>> The two load balancers will proxy to the 4 backend servers. These 4
>>> backend servers will run haproxy in front of nginx/varnish with load
>>> balancing of "first" and a suitable MAXCONN. Server 1 haproxy will first
>>> route to the localhost nginx/varnish and when MAXCONN connections are
>>> active to the localhost, will forward the connection to Server 2 haproxy.
>>> Server 2 and 3 will be set up similarly to first route requests to
>>> localhost and when full, route subsequent requests to the next server.
>>> Server 4 will route excess requests to a small Amazon EC2 instance to
>>> return a "servers are all busy" page. Hopefully, I will be able to add a
>>> 5th backend server at Amazon to handle the overload if it looks like I
>>> really do have traffic that will fill all 4 backend servers that I am
>>> colo'ing (I don't really expect this to ever be necessary).
>>>
>>> Nginx will proxy to PHP on localhost and each localhost (of my 4 backend
>>> servers) will have 2 MySQL instances - one for the main Read-Only DB and
>>> one for a Read-Write SessionDB. PHP will go directly to the main DB (not
>>> through HAProxy) and will use HAProxy to select the proper SessionDB to use
>>> (each user session must use the same SessionDB so the one a request needs
>>> might be on any of the backend servers). Each SessionDB will be the master
>>> of one slave SessionDB on a different backend server for handling the
>>> failure of the master (haproxy will send requests to the slave SessionDB if
>>> the master is down or failing).
>>>
>>> So, each backend server will have haproxy to "first" balance HTTP to
>>> nginx/varnish. The backends also have PHP and 3 instances of MySQL (one
>>> for mainDB, one for master sessionDB, and one for another backend's slave
>>> sessionDB).
>>>
>>> Also, the 2 frontend servers will be running separate instances of haproxy.
>>> I hope to use keepalived to route the VIPs for one frontend to the other
>>> frontend in case of failure. Or, should I use heartbeat? There seems to
>>> be two HA solutions here.
>>>
>>> I know this is a very long description of what I am thinking of doing and I
>>> thank you if you have read this far. I'm looking for any comments on this
>>> setup. Especially, any comments on using "first" load balancing/MAXCONN on
>>> the backend servers so that a request load balanced from the frontend will
>>> keep the backend servers from overloading (possibly bouncing a request from
>>> server 1 to server 2 to server 3 to server 4 to EC2 "server busy" server)
>>> are especially appreciated. Also, any comments on using pairs of
>>> master/slave sessionDBs to provide high availability but still have session
>>> data saved/retrieved for a given user from the same DB are appreciated. I
>>> believe this setup will allow the load to be distributed evenly over the 4
>>> backends and only have the front end load balancers do simple round robin
>>> without session stickiness.
>>>
>>> Kevin
>>>
>>>
>
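
To make the maxconn point from the reply concrete, here is a toy model in
Python of a round-robin balancer that skips any server already at its
maxconn. It is only a sketch of the idea, not HAProxy's actual
implementation: the names Server, RoundRobin, pick and release are
hypothetical, and the server names and limits are made up for
illustration.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical stand-in for one backend server (not a HAProxy object)."""
    name: str
    maxconn: int      # per-server limit on concurrent connections
    active: int = 0   # connections currently in flight


class RoundRobin:
    """Toy round robin that skips servers already at maxconn."""

    def __init__(self, servers):
        self.servers = servers
        self._next = 0  # index of the server to try first

    def pick(self):
        """Return the next server with a free slot, or None if all are full."""
        n = len(self.servers)
        for offset in range(n):
            idx = (self._next + offset) % n
            srv = self.servers[idx]
            if srv.active < srv.maxconn:
                srv.active += 1
                self._next = (idx + 1) % n
                return srv
        # Every server is saturated; this model just reports it.
        return None

    def release(self, srv):
        """A connection finished, so the server is eligible again."""
        srv.active -= 1


# Made-up example: web1 accepts 2 concurrent connections, web2 only 1.
servers = [Server("web1", maxconn=2), Server("web2", maxconn=1)]
lb = RoundRobin(servers)

picked = [lb.pick() for _ in range(4)]
names = [s.name if s else None for s in picked]
print(names)  # ['web1', 'web2', 'web1', None]

lb.release(servers[1])   # one connection on web2 ends
print(lb.pick().name)    # web2
```

The third request goes to web1 because web2 is already full, and the
fourth finds no free slot at all. Note that with several independent LBs,
each one counts only its own connections, so the per-LB limit would
presumably have to be a share of what a server can really take; that is
the reason for adapting maxconn to the number of LBs available.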

