Hi Sam,
> We are running dual HAProxy machines as our load balancers for our > web application, with keepalived for failover. This is the 2nd time > that both HAProxy instances have died in production with no indication > as to why. When this happens, do you always see both HAProxy instances crashing at the same time? > The load on the servers at the time was extremely low. We routinely do > MUCH MUCH more traffic than we had at the time of the last crash. Our > environment will stay up for months at a time without a hickup, and then > boom. Do you have outages on the backends or the network in between the proxy and the backend when this happens? I'm asking because you also posted: > HAProxy error log before crash: > Nov 10 19:02:32 localhost haproxy[10926]: backend chat-cluster-01 has > no server available! .. which is actually a major crash bug in the HAProxy releases you are using: http://haproxy.1wt.eu/git?p=haproxy.git;a=commit;h=0fc36e3ae99ccbe6de88cf64093f3045e526d088 There is another major bug fixed in commit 506d050600, but I don't think you are running into it. Both bugs are fixed in HAProxy snapshot 20130707 (haproxy-ss-20130707.tar.gz) and later: http://haproxy.1wt.eu/download/1.5/src/snapshot/ I would suggest: - clarify if there is a connection between backend server failures and HAProxy crashes - either backport the bugfix or upgrade to one of the snapshots containing the bugfix Note that -dev17 has 2 security problems (one fixed in dev18, another one fixed in dev19, see http://haproxy.1wt.eu/news.html), so I would suggest you upgrade both HAProxy instances. If my guess here is wrong and HAProxy is still crashing, you need to enabled coredumping: http://www.mail-archive.com/[email protected]/msg09472.html Regards, Lukas

