Hi Mike,
On Tue, Mar 08, 2016 at 10:57:21AM -0500, Mike Curry wrote:
> HAProxy is suddenly crashing on new Ubuntu (Digital Ocean, AWS - 14.04 and
> 15.10) installs. I've had the same configuration working for over a year
> now. I've posted all the logs and details below. Is there a new bug, or
> should my configuration be changed to suit the new versions?
>
> I am seeing this fail with 1.5.14, 1.5.15 - however, on older images, I am
> seeing this work with 1.5.14.
>
> http://serverfault.com/questions/762407/haproxy-suddenly-crashing-on-new-ubuntu-images-same-config-works-elsewhere
>
> Here is my HAProxy config:
>
> global
> daemon
> maxconn 256000
^^^^^^^^^^^^^^
This and this :
> defaults
> maxconn 256000
Imply that you'll need roughly 8.5 GB of RAM for HAProxy alone, plus
about 4 GB for the kernel's TCP stack, for a total of about 12.5 GB.
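To make the arithmetic explicit, assuming the default tune.bufsize of 16384
and roughly 1 kB of per-session overhead (the exact overhead varies with the
version and options) :

  haproxy : 256000 sessions x (2 x 16 kB buffers + ~1 kB)  ~= 8.5 GB
  kernel  : 2 x 256000 sockets x ~8 kB of socket buffers   ~= 4.0 GB
  total                                                    ~= 12.5 GB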
Given your kernel trace says this :
> Mar 8 10:42:05 www kernel: [ 263.005501] 262044 pages RAM
I conclude you have about 1 GB of RAM in this machine (262044 pages x 4 kB
per page). And as you can see, haproxy is not crashing at all : the kernel's
OOM killer is killing it because it's the process eating the most memory when
the system runs out of memory. So I guess the reason it's suddenly crashing
is that you used to be very close to the memory limit, and now you're getting
more traffic and the machine is too small to accept all the connections you
have configured.
One solution to get back to non-crashing behaviour would be to reduce
the number of connections above to about 16000 to stay within reasonable
limits, but that will mean you'll start to delay some incoming connections,
which is not good.
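For example, something like this, where 16000 is only an illustrative value
to adjust to your real traffic :

  global
      daemon
      maxconn 16000

  defaults
      maxconn 16000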
Another short-term solution consists in reducing haproxy's buffer size,
which will cut its memory usage almost in half. For this, just add this
to the global section :
tune.bufsize 8192
tune.maxrewrite 1024
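In context, your global section would then look like this (only the two
tune.* lines are new) :

  global
      daemon
      maxconn 256000
      tune.bufsize    8192
      tune.maxrewrite 1024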
It will only last the time it takes for your visitors to increase the load.
HAProxy 1.6 supports dynamic buffer allocation, which uses a lot less memory
on idle connections. You can even enforce a hard limit on the memory allocated
by the buffers. It still won't solve the fact that the kernel side also
needs some room for the sockets it's dealing with.
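If you do move to 1.6, the hard limit is set with tune.buffers.limit, with
tune.buffers.reserve keeping a small pool for sessions that need to make
progress; the values below are only illustrative :

  global
      # 1.6+ only : never allocate more than 20000 buffers in total,
      # i.e. about 20000 x 16 kB ~= 320 MB with the default bufsize
      tune.buffers.limit   20000
      # small reserve used to let existing sessions keep progressing
      # once the limit is reached
      tune.buffers.reserve 4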
A long-term solution requires increasing the memory in this machine to match
the level of load you expect to handle. If you *really* want to support 256k
concurrent connections, plan for 16 GB of RAM. If you put that value there
because you didn't know what to put, well... at least now you know that
anything much beyond about 20k connections needs more memory than this
machine has.
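Those two numbers come from the same rough per-connection arithmetic as
above :

  256000 connections x ~50 kB (haproxy + kernel) ~= 12.5 GB + OS margin -> ~16 GB
  1 GB of RAM / ~50 kB per connection            ~= 20000 connections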
You must absolutely check your stats periodically to ensure the load always
remains within expected bounds. You may notice that the site sometimes gets
attacked, and since your limits are huge, such attacks can make the whole
system collapse.
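If you don't have the stats page enabled yet, a minimal example looks like
this (the port and URI are only examples; add "stats auth user:password" if
the port is reachable from outside) :

  listen stats
      bind :8404
      mode http
      stats enable
      stats uri /haproxy?stats
      stats refresh 10s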
Willy