Thank you for the response, Lukas.

> > We are running dual HAProxy machines as our load balancers for our
> > web application, with keepalived for failover.  This is the 2nd time
> > that both HAProxy instances have died in production with no indication
> > as to why.
>
> When this happens, do you always see both HAProxy instances crashing at the
> same time?


Yes, both instances at the same time.


> > The load on the servers at the time was extremely low.  We routinely do
> > MUCH MUCH more traffic than we had at the time of the last crash. Our
> > environment will stay up for months at a time without a hiccup, and then
> > boom.
>
> Do you have outages on the backends or the network in between the proxy and
> the backend when this happens?
>

No, the load balancers are on the same Gigabit LAN as the backends.  Our
db servers are on that LAN too, and we saw no db connection errors.  The
backends never went down.


> I'm asking because you also posted:
> > HAProxy error log before crash:
> >    Nov 10 19:02:32 localhost haproxy[10926]: backend chat-cluster-01 has
> >    no server available!
>
>
> .. which is actually a major crash bug in the HAProxy releases you are
> using:
>
>
> http://haproxy.1wt.eu/git?p=haproxy.git;a=commit;h=0fc36e3ae99ccbe6de88cf64093f3045e526d088
>
>
> There is another major bug fixed in commit 506d050600, but I don't think
> you are running into it.
>
>
> Both bugs are fixed in HAProxy snapshot 20130707
> (haproxy-ss-20130707.tar.gz) and later:
>
> http://haproxy.1wt.eu/download/1.5/src/snapshot/
>
>
> I would suggest:
> - clarify if there is a connection between backend server failures and
>   HAProxy crashes
> - either backport the bugfix or upgrade to one of the snapshots containing
>   the bugfix
>
> Note that -dev17 has 2 security problems (one fixed in dev18, another one
> fixed in dev19, see http://haproxy.1wt.eu/news.html), so I would suggest
> you upgrade both HAProxy instances.
>
>
> If my guess here is wrong and HAProxy is still crashing, you need to
> enable core dumps:
>
> http://www.mail-archive.com/[email protected]/msg09472.html



Thank you so much, this is a big help.  I will upgrade both HAProxy
instances to the latest snapshot release and report back if they crash
again.

Regards,
Sam
