Hello,

On Wed, Jan 13, 2010 at 07:12:45PM -0800, Jose Avila(Tachu) wrote:
> I have a theory on a recent issue I've been experiencing for the last 2 days
> that I would like some clarification on.
> I have a load balancer with 30 backend servers with a roundrobin balance line
> and a maxconn per server of 75. From my understanding, the round robin will
> go in turns, one by one. Once the balancer starts to queue up, e.g. reaches
> the maximum of 75 requests on all servers, will the queue wait for the next
> available server on the stack, or will it wait for the actual server that
> was next in turn?
The round robin applies only between servers that are not full, so when a
request comes in, if there is a server with available connection slots, the
request may be served by that server. If all servers are full, the request is
queued in the backend's queue, where it will be picked up by the first server
which releases a connection.

However, if you enable cookie-based persistence, requests which contain a
cookie designating a specific server will be sent to that server, and if it
is full, they will be placed in that server's own queue, where only that
server can pick them up. This is normal because in this case we want only
that server to process the request.

> If so, 1 slow request that takes 6 seconds would mean that every single
> request in the queue will take at least 6 seconds, even if the server
> response time for all those queued-up requests would be, say, 200 ms,
> causing a snowball effect of slow requests?

This is the case only for that server's queue, which only contains requests
which want to be processed exclusively by that server.

> Am I seeing this right? And if so, to avoid this issue, would it be better
> to change my balance to leastconn so it takes the next available server
> instead of waiting for the next in the round robin?

No, because once the servers are full, roundrobin and leastconn behave
similarly: requests are sent to the backend's queue and picked up by the
first server which frees a slot.

> The issue you see at around 11:45 was me changing the maxconn per server
> from 75 to 300, effectively clearing the queue, increasing the avg server
> response time from 1 s to about 4 seconds, but overall dropping my total
> response time to the client to about 4 seconds.

This means that there is some processing on your servers that waits for
external data, typically a database, and which prevents your servers from
working at full speed at only 75 connections. CPU-bound tasks are better
handled with a maxconn close to the number of CPU cores of the server.
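To make the queuing behaviour above concrete, here is a minimal backend
sketch; the server names, addresses, cookie name, and check option are
illustrative, not taken from the original setup:

```
backend app
    balance roundrobin
    # persistence cookie: a request carrying "SRVID=web1" is sent to
    # web1, and if web1 is full it waits in web1's own queue instead
    # of the shared backend queue
    cookie SRVID insert indirect nocache
    # each server accepts at most 75 concurrent connections; requests
    # beyond that wait in the backend's shared queue and are picked up
    # by whichever server frees a slot first
    server web1 10.0.0.1:80 cookie web1 maxconn 75 check
    server web2 10.0.0.2:80 cookie web2 maxconn 75 check
```

Without the "cookie" lines, every queued request goes to the shared backend
queue and can be served by any server.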
I/O-bound tasks generally want higher values, so that the server can work on
something else while it is waiting for some data.

Do you know of specific requests which can be slow? If so, it makes sense to
configure two distinct backends: one with few connections for the slow
requests, and one with more connections for the fast requests, so that the
fast requests all share the same queue and are not delayed by slow ones.

Regards,
Willy
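For illustration, the two-backend split suggested above could look like the
following; the /reports path, the backend names, and the maxconn values are
hypothetical and would need tuning for the real traffic:

```
frontend www
    bind :80
    # hypothetical ACL: requests under /reports are the known-slow ones
    acl is_slow path_beg /reports
    use_backend slow_requests if is_slow
    default_backend fast_requests

backend slow_requests
    balance roundrobin
    # low concurrency: slow requests queue here among themselves
    # without occupying slots needed by fast traffic
    server web1 10.0.0.1:80 maxconn 10 check
    server web2 10.0.0.2:80 maxconn 10 check

backend fast_requests
    balance roundrobin
    server web1 10.0.0.1:80 maxconn 75 check
    server web2 10.0.0.2:80 maxconn 75 check
```

The same servers appear in both backends; each backend maintains its own
queue, so a burst of slow requests can no longer delay the fast ones.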

