> Also Kevin, I don't really know what's the database usage profile of your
> app, but I'd immediately rule out installing the DB on the web servers,
> especially having two MySQL instances on *each* machine that will be serving
> PHP...

Why? I like the idea that each physical backend server can independently service a full request. If the physical server is up and running, all the services running on it are likely up and running too (especially if I don't push the server so hard that it is overloaded). Need to handle more requests? Simply add another backend server or upgrade the specs (and MAXCONN/server weight) of one or more existing servers.

Kevin

On Jan 3, 2013, at 5:06 AM, Pedro Mata-Mouros <[email protected]> wrote:

> Also Kevin, I don't really know what's the database usage profile of your
> app, but I'd immediately rule out installing the DB on the web servers,
> especially having two MySQL instances on *each* machine that will be serving
> PHP...
>
> Cheers,
> Pedro.
>
> On 3 Jan 2013, at 09:25, KT Walrus <[email protected]> wrote:
>
>>> basically, you need persistence :)
>>
>> Well, I only need persistence to optimize traffic flow so the correct
>> sessionDB is used (eliminating a network hop). But, the system will still
>> function without persistence (in HAProxy) as the PHP code will know which
>> sessionDB it needs to use for a given user. In this case, persistence can
>> be ensured by the PHP code even if HAProxy routes to a suboptimal initial
>> backend.
>>
>> In the multiple DC case, I will lose persistence if one DC fails. The
>> forwarded requests to the other DC will have to establish a new session in a
>> new sessionDB, but DC failure should be rare enough that I don't care about
>> this. My site doesn't need 100% availability, just minimized user perceived
>> downtime of minutes rather than hours.
>>
>> On Jan 3, 2013, at 3:49 AM, Baptiste <[email protected]> wrote:
>>
>>> basically, you need persistence :)
>>>
>>> On Thu, Jan 3, 2013 at 9:45 AM, KT Walrus <[email protected]> wrote:
>>>> One more tweak… I think the frontend LBs could be made to distribute the
>>>> load so that requests go to the backend that has the sessionDB that will
>>>> be used for the request rather than simple RR (by using cookies). This
>>>> would keep most requests handled entirely by a single backend server. I
>>>> kind of like this, from an efficiency and simplicity point of view.
>>>>
>>>> Most setups seem to want you to place each individual component of the
>>>> backend (HAProxy, Nginx/Varnish, PHP, and MySQL) in separate VPSs (in a
>>>> "cloud" architecture). But, I'm thinking that it will simplify things if
>>>> I don't use virtualization and have each backend capable of handling the
>>>> entire request. If I need more capacity in the backend, I simply add
>>>> another backend server that functions independently of the other backends
>>>> (except for handling HA in times of high load where one backend forwards
>>>> the excess requests to its next neighbor backend).
>>>>
>>>> I do have one problem in my proposed architecture. A sessionDB could,
>>>> theoretically, get many more than MAXCONN connections (in the worst case,
>>>> all current requests could use a single sessionDB). This is because once
>>>> a sessionDB is selected for an individual user, all subsequent requests
>>>> from that user must be handled using this sessionDB. This means I have to
>>>> keep MAXCONN low enough that if the sessionDB in the backend does have to
>>>> handle all requests to all backends, the server will still function and
>>>> not be overloaded. It would be nice if this wasn't the case, but I can't
>>>> think of how to avoid this possibility.
>>>> If I could, I could probably set
>>>> MAXCONN to utilize 80% of the backend rather than a more conservative 50%,
>>>> eventually saving significant money in scale out.
>>>>
>>>> On Jan 3, 2013, at 2:56 AM, KT Walrus <[email protected]> wrote:
>>>>
>>>>> Thanks for the reply.
>>>>>
>>>>>> Why installing 2 layers of HAProxy???
>>>>>> A single one (on the 2 servers) is enough.
>>>>>
>>>>> My thought was that the second layer of HAProxy would ensure that the
>>>>> individual backend server would never have more than MAXCONN requests so
>>>>> I know the server will never be overloaded, possibly leading to the server
>>>>> going down or taking too long to process a request.
>>>>>
>>>>> I want multiple active frontend LBs so that my architecture will scale
>>>>> infinitely to many more frontends if necessary. If I eventually needed
>>>>> more than 6 servers, I would set up another 6 servers (using the same
>>>>> setup at 2 data centers for additional HA).
>>>>>
>>>>>> Since you're doing SSL, try
>>>>>> to make it start multiple processes, a single one dedicated to HTTP
>>>>>> and all other one for cyphering/deciphering processing…
>>>>>
>>>>> Yes. I planned on doing that. My 2 frontend servers are UP (4 cores)
>>>>> while the 4 backend servers can be upgraded to DP (16 cores) and huge RAM
>>>>> (256GBs). I've already purchased these servers. I expect that 1
>>>>> frontend server would be sufficient for a long time, but I want HA by
>>>>> having the two frontends on separate independent power/ethernet
>>>>> connections within the datacenter.
>>>>>
>>>>>> I'm not a fan of first algo, unless you pay the resource per number of
>>>>>> backend server, which is not your case.
>>>>>
>>>>> I just thought "first" load balancing was perfect for "guarding" that an
>>>>> individual backend server never exceeded MAXCONN concurrent requests.
>>>>> The overhead should be minimal since this "guard" HAProxy almost always
>>>>> will pass the request to localhost nginx/varnish.
>>>>> I need this "guard"
>>>>> because there are multiple frontend LBs doing simple round robin to the
>>>>> backends independently. This might become more of a possibility when and
>>>>> if I need more LBs independently distributing requests to the backends.
>>>>>
>>>>>> Prefer using a hash in your case (even multiple hash with different
>>>>>> backends and content switching), that way, your hit rate would be much
>>>>>> better.
>>>>>
>>>>> I'm not so concerned about individual hit rate as I am about HA and
>>>>> infinite scalability. It is relatively cheap to add a new server to
>>>>> handle more backend or frontend load or split to placing some servers in
>>>>> a new datacenter. I'd rather have my servers run at 50% capacity
>>>>> (purchasing twice the hardware) if that means increased HA from having
>>>>> the guard HAProxy's and never coming close to pushing them so hard that
>>>>> individual pieces of the software/hardware stack start to fail.
>>>>>
>>>>>> no need to host a sorry page on a far away server, host it on your
>>>>>> frontend LBs and HAProxy can deliver it once your server farm is
>>>>>> full…
>>>>>
>>>>> That is true. I was really thinking that maybe the first Amazon
>>>>> "overflow" server might be set up to actually have a full backend server.
>>>>> If the sorry page ever starts to be served by Amazon, I would simply
>>>>> create one or more EC2 servers to take the temporary load. I actually
>>>>> plan on implementing the website as EC2 instances (using this
>>>>> architecture) until my Amazon bill goes over $500 a month at which time I
>>>>> would go colo.
>>>>>
>>>>>> Another remark, it may be hard to troubleshoot such infra with 2
>>>>>> Active/active LBs.
>>>>>
>>>>> I think I have to deal with this, but since each LB is handling unique
>>>>> VIPs (unless keepalived kicks in due to failure), I don't think there is
>>>>> going to be that much trouble.
>>>>>
>>>>>> And using DNS rr does not prevent you from using keepalived to ensure
>>>>>> HA between your 2 HAProxys.
>>>>>
>>>>> Yes. I am hoping this is the case. I eventually want at least two
>>>>> geographic locations (east and west coast data centers) so 4 IPs in the
>>>>> DNS to distribute to the closest datacenter. I use DNSMadeEasy which can
>>>>> support both DNS Global Traffic Director (east coast and west coast IP
>>>>> Anycast) and DNS Failover (in case one datacenter goes offline).
>>>>>
>>>>>>
>>>>>> cheers
>>>>>>
>>>>>>
>>>>>> On Thu, Jan 3, 2013 at 12:20 AM, KT Walrus <[email protected]> wrote:
>>>>>>> I'm setting up a new website in the next month or two. Even though the
>>>>>>> traffic won't require a scalable HA website, I'm going to start out as
>>>>>>> if the website needs to support huge traffic so I can get some
>>>>>>> experience running such a website.
>>>>>>>
>>>>>>> I'd like any feedback on what I am thinking of doing…
>>>>>>>
>>>>>>> As for hardware, I am colocating 6 servers at this time and plan to use
>>>>>>> Amazon S3 to host the static files (which should grow quickly to 1TB or
>>>>>>> 2TB of mostly images). 2 of the servers are going to be my frontend
>>>>>>> load balancers running haproxy. The remaining 4 servers will be
>>>>>>> nginx/varnish servers (nginx for the PHP/MySQL part of the site and
>>>>>>> varnish to cache the Amazon S3 files to save bandwidth charges by
>>>>>>> Amazon).
>>>>>>>
>>>>>>> I plan on doing DNS load balancing using pairs of A records for each
>>>>>>> hosted domain that will point to each of my frontend haproxy load
>>>>>>> balancers. Most traffic will be HTTPS, so I plan on having the
>>>>>>> frontend load balancers handle the SSL (using the new haproxy
>>>>>>> support for SSL).
>>>>>>>
>>>>>>> The two load balancers will proxy to the 4 backend servers. These 4
>>>>>>> backend servers will run haproxy in front of nginx/varnish with load
>>>>>>> balancing of "first" and a suitable MAXCONN.
>>>>>>> Server 1 haproxy will
>>>>>>> first route to the localhost nginx/varnish and when MAXCONN connections
>>>>>>> are active to the localhost, will forward the connection to Server 2
>>>>>>> haproxy. Server 2 and 3 will be set up similarly to first route
>>>>>>> requests to localhost and when full, route subsequent requests to the
>>>>>>> next server. Server 4 will route excess requests to a small Amazon EC2
>>>>>>> instance to return a "servers are all busy" page. Hopefully, I will be
>>>>>>> able to add a 5th backend server at Amazon to handle the overload if it
>>>>>>> looks like I really do have traffic that will fill all 4 backend
>>>>>>> servers that I am colo'ing (I don't really expect this to ever be
>>>>>>> necessary).
>>>>>>>
>>>>>>> Nginx will proxy to PHP on localhost and each localhost (of my 4
>>>>>>> backend servers) will have 2 MySQL instances - one for the main
>>>>>>> Read-Only DB and one for a Read-Write SessionDB. PHP will go directly
>>>>>>> to the main DB (not through HAProxy) and will use HAProxy to select the
>>>>>>> proper SessionDB to use (each user session must use the same SessionDB
>>>>>>> so the one a request needs might be on any of the backend servers).
>>>>>>> Each SessionDB will be the master of one slave SessionDB on a different
>>>>>>> backend server for handling the failure of the master (haproxy will
>>>>>>> send requests to the slave SessionDB if the master is down or failing).
>>>>>>>
>>>>>>> So, each backend server will have haproxy to "first" balance HTTP to
>>>>>>> nginx/varnish. The backends also have PHP and 3 instances of MySQL
>>>>>>> (one for mainDB, one for master sessionDB, and one for another
>>>>>>> backend's slave sessionDB).
>>>>>>>
>>>>>>> Also, the 2 frontend servers will be running separate instances of
>>>>>>> haproxy. I hope to use keepalived to route the VIPs for one frontend
>>>>>>> to the other frontend in case of failure. Or, should I use heartbeat?
>>>>>>> There seem to be two HA solutions here.
>>>>>>>
>>>>>>> I know this is a very long description of what I am thinking of doing
>>>>>>> and I thank you if you have read this far. I'm looking for any
>>>>>>> comments on this setup. In particular, any comments on using "first" load
>>>>>>> balancing/MAXCONN on the backend servers so that a request load
>>>>>>> balanced from the frontend will keep the backend servers from
>>>>>>> overloading (possibly bouncing a request from server 1 to server 2 to
>>>>>>> server 3 to server 4 to EC2 "server busy" server) are especially
>>>>>>> appreciated. Also, any comments on using pairs of master/slave
>>>>>>> sessionDBs to provide high availability but still have session data
>>>>>>> saved/retrieved for a given user from the same DB are appreciated. I
>>>>>>> believe this setup will allow the load to be distributed evenly over
>>>>>>> the 4 backends and only have the front end load balancers do simple
>>>>>>> round robin without session stickiness.
>>>>>>>
>>>>>>> Kevin
>>>>>>>
>>>>>>>
>>>>>
>>>>
>>
>>
>
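
For anyone following the thread, the "first"/MAXCONN overflow idea from the original message can be sketched as a small simulation. This is only an illustration of the routing rule (fill the first server up to MAXCONN, then spill to the next), not HAProxy's implementation. It is simplified to a single ordered chain rather than one guard HAProxy per backend, and the function name `route_first`, the server labels, the "sorry-page" fallback label, and the MAXCONN value of 2 are all made up for the example.

```python
from typing import Dict, List, Optional


def route_first(chain: List[str], active: Dict[str, int], maxconn: int) -> Optional[str]:
    """Pick the first server in the chain with a free slot and count the connection.

    Returns None when every server already holds maxconn active connections.
    """
    for server in chain:
        if active.get(server, 0) < maxconn:
            active[server] = active.get(server, 0) + 1
            return server
    return None


# Hypothetical chain: server1 overflows to server2, and so on down the line.
chain = ["server1", "server2", "server3", "server4"]
active: Dict[str, int] = {}

# Nine concurrent requests with no completions; anything that does not fit
# on the four backends falls through to the (hypothetical) sorry page.
placements = [route_first(chain, active, maxconn=2) or "sorry-page" for _ in range(9)]
print(placements)
```

With these made-up numbers the first eight requests fill the four servers two at a time and the ninth falls through to the sorry page, which is the bouncing behaviour (server 1 to 2 to 3 to 4 to "server busy") the original message asks about.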

