> On 13 Oct 2015, at 18:47, Willy Tarreau <[email protected]> wrote:
>
> Hi Dmitry,
>
> sorry for the delay, I really didn't have time to analyse the config
> you sent me.
>
> A few points below :
>
> On Wed, Oct 07, 2015 at 04:18:20PM +0300, Dmitry Sivachenko wrote:
>> Oct 7 08:33:03 srv1 haproxy[77565]: unix:1 [07/Oct/2015:08:33:02.428]
>> MT-front MT_RU_EN-back/<NOSRV> 0/1000/-1/-1/1000 503 212 - - sQ--
>> 125/124/108/0/0 0/28 "POST /some/url HTTP/1.1"
>> (many similar at one moment)
>>
>> Common part in these errors is "1000" in Tw and Tt, and "sQ--" termination
>> state.
>>
>> Here is the relevant part on my config (I can post more if needed):
>>
>> defaults
>>     balance roundrobin
>>     maxconn 10000
>>     timeout queue 1s
>>     fullconn 3000
>>     default-server inter 5s downinter 1s fastinter 500ms fall 3 rise 1
>>         slowstart 60s maxqueue 1 minconn 5 maxconn 150
>>
>> backend MT_RU_EN-back
>>     mode http
>>     timeout server 30s
>>     server mt1-34 mt1-34:19016 track MT-back/mt1-34 weight 38
>>     server mt1-35 mt1-35:19016 track MT-back/mt1-35 weight 38
>>     <total 18 of similar servers>
>>
>> So this error log indicates that request was sitting in the queue for
>> timeout queue==1s and his turn did not come.
>>
>> In the stats web interface for MT_RU_EN-back backend I see the following
>> numbers:
>>
>> Sessions: limit=3000, max=126 (for the whole backend)
>> Limit=150, max=5 or 6 (for each server)
>>
>> If I understand minconn/maxconn meaning right, each server should accept up
>> to min(150, 3000/18) connections
>>
>> So according to stats the load were far from limits.
>
> No, look, the log says there were 108 connections on the backend. This
> is important since you're using minconn so you're using dynamic queueing.
> This means that the effective limit when handling this request was around
> maxconn*currconn/fullconn, which is 150*108/3000 = 5.4 so the limit was
> at 5 connections. Thus the limit for this server was indeed reached.
>
> Playing with minconn and fullconn is hard and strongly advised against,
> unless you know exactly how to tune it. You must always ensure that a
> normal load will be handled without queuing (or with a very small queue),
> and that maxconn will quickly be reached to handle high traffic. I tend to
> consider that an efficient fullconn is around 10% of the maximum load the
> farm may have to deal with (which is the default value IIRC). Regarding
> minconn, it's interesting not to set it too low. A good rule of thumb is
> to estimate what would happen at 10% of fullconn (1% of the max load).
> In your case, at 300 concurrent connections, your servers will accept
> 15 connections each. I have no idea whether this is enough or not to
> handle the load. But let's say you have 4 servers, that's only 60
> concurrent connections to process 300 front connections. While it can
> be perfectly fine, you may need to increase the queue timeout so that
> the requests can wait long enough for a slot. With a 5:1 overbooking and
> your 1s queue timeout, that means you expect that the server's average
> response time will not go above 200ms. That may be a bit short for some
> applications, especially those sensitive to connection count.
>
> Thus I'd suggest that you either lower fullconn or increase minconn, and
> in any case that you also increase the queue timeout to cover the worst
> overbooking situation with the average server's response time.
>
> During the tuning phase, I'd suggest to *significantly* increase the queue
> timeout so that you can observe the connection counts and even the average
> response time per connection count, that will help you refine the tuning.
>
Thanks for the explanation, it looks like I misunderstood the minconn/maxconn logic.
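
To double-check the arithmetic in the explanation above, here is a minimal Python sketch of the two calculations it describes: the dynamic per-server limit (maxconn*currconn/fullconn) and the average response time implied by a given overbooking ratio and queue timeout. The function names are made up for illustration and this is not HAProxy's actual code. Clamping the result between minconn and maxconn is my assumption about how the bounds apply; the quoted formula only gives the proportional part.

```python
def effective_maxconn(minconn: int, maxconn: int, fullconn: int, backend_conns: int) -> int:
    """Approximate per-server limit under dynamic queueing (hypothetical helper)."""
    # Proportional part taken from the quoted reply: maxconn * currconn / fullconn.
    scaled = maxconn * backend_conns // fullconn
    # Assumption: the limit never drops below minconn or rises above maxconn.
    return max(minconn, min(maxconn, scaled))


def max_avg_response_time(front_conns: int, servers: int,
                          per_server_limit: int, queue_timeout_s: float) -> float:
    """Average response time the servers must sustain so that queued
    requests get a slot before the queue timeout (hypothetical helper)."""
    overbooking = front_conns / (servers * per_server_limit)
    return queue_timeout_s / overbooking


# Values from the config and log line quoted above.
limit_at_log_time = effective_maxconn(minconn=5, maxconn=150, fullconn=3000, backend_conns=108)

# The "10% of fullconn" rule of thumb: 300 concurrent connections.
limit_at_300 = effective_maxconn(minconn=5, maxconn=150, fullconn=3000, backend_conns=300)

# The 4 servers are the hypothetical figure from the reply, not my real farm of 18.
budget_s = max_avg_response_time(front_conns=300, servers=4,
                                 per_server_limit=limit_at_300, queue_timeout_s=1.0)

print(limit_at_log_time, limit_at_300, budget_s)  # 5 15 0.2
```

With 108 connections on the backend this gives a per-server limit of 5, which matches the max=5 or 6 I saw in the stats page, so the servers really were at their effective limit when the requests timed out in the queue.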

