On Thu, Oct 8, 2015 at 7:15 PM, Dmitry Sivachenko <trtrmi...@gmail.com>
wrote:

>
> > On 7 Oct 2015, at 16:18, Dmitry Sivachenko <trtrmi...@gmail.com>
> wrote:
> >
> > Hello,
> >
> > I am using haproxy-1.5.14 and sometimes I see the following errors in
> the log:
> >
> > Oct  7 08:33:03 srv1 haproxy[77565]: unix:1 [07/Oct/2015:08:33:02.428]
> MT-front MT_RU_EN-back/<NOSRV> 0/1000/-1/-1/1000 503 212 - - sQ--
> 125/124/108/0/0 0/28 "POST /some/url HTTP/1.1"
> > (many similar at one moment)
> >
> > Common part in these errors is "1000" in Tw and Tt, and "sQ--"
> termination state.
> >
> > Here is the relevant part on my config (I can post more if needed):
> >
> > defaults
> >    balance roundrobin
> >    maxconn 10000
> >    timeout queue 1s
> >    fullconn 3000
> >    default-server inter 5s downinter 1s fastinter 500ms fall 3 rise 1
> slowstart 60s maxqueue 1 minconn 5 maxconn 150
> >
> > backend MT_RU_EN-back
> >    mode http
> >    timeout server 30s
> >    server mt1-34 mt1-34:19016 track MT-back/mt1-34 weight 38
> >    server mt1-35 mt1-35:19016 track MT-back/mt1-35 weight 38
> >    <total 18 of similar servers>
> >
> > So this error log indicates that request was sitting in the queue for
> timeout queue==1s and his turn did not come.
> >
> > In the stats web interface for MT_RU_EN-back backend I see the following
> numbers:
> >
> > Sessions: limit=3000, max=126 (for the whole backend)
> > Limit=150, max=5 or 6 (for each server)
>
>
> I also forgot to mention the "Queue" values from stats web-interface:
> Queue max = 0 for all servers
> Queue limit = 1 for all servers (as configured in default-server)
> So according to stats queue was never used.
>
>
> Right under the servers list, there is a "Backend" line, which has the
> value of "29" in "Queue Max" column.
> What does it mean?
>
>
Well, that means you had up to 29 requests in the backend queue waiting for
a connection. In my case I have never seen this queue go above 0 on the
backend or on any of its servers, for that matter. Also, the queue limit
per server is 128, not 1 (I think you are confusing the queue limit with the
queue timeout, which you have indeed set to 1 sec).

So, as mentioned before and as pointed out by Baptiste, your servers are not
as fast as you expect them to be, i.e. you have set your queue size and
timeout too low. First, is haproxy on the same LAN segment as the backend
servers? For example, what is the value of the LastChk column? It should be
in the ms (milliseconds) range if your servers are close to haproxy and not
under heavy load.

If I were in your shoes I would:

- drop the fullconn setting and let haproxy do the math for me
- definitely increase the queue timeout to more than 1 sec (why would you
risk losing messages, unless you are short on RAM?)
- set the connect timeout as per the excerpt I sent previously

and see how I go.
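
As an illustration only, the three changes above could look roughly like
this in your defaults section. The 5s values are placeholders I picked for
the sketch, not recommendations, and the connect timeout shown is an
assumption, since the earlier excerpt is not reproduced in this message.
The default-server line is copied unchanged from your config.

```
defaults
   balance roundrobin
   maxconn 10000
   # "fullconn 3000" removed so that haproxy computes its own default
   # placeholder value: anything comfortably above the original 1s
   timeout queue 5s
   # assumed value: use whatever the earlier excerpt suggested
   timeout connect 5s
   default-server inter 5s downinter 1s fastinter 500ms fall 3 rise 1 slowstart 60s maxqueue 1 minconn 5 maxconn 150
```

After reloading, the things to watch are the "Queue Max" column on the
Backend line and whether the sQ-- entries with Tw=1000 still show up in the
log.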


>
> >
> > If I understand minconn/maxconn meaning right, each server should accept
> up to min(150, 3000/18) connections
> >
> > So according to stats the load were far from limits.
> >
> > What can be the cause of such errors?
> >
> > Thanks!
>
>
>
