On Mon, 06 Mar 2017 15:02:43 -0500, Willy Tarreau <[email protected]> wrote:

OK so that means that haproxy could have hung in a day or two, then your
case is much more common than one of the other reports. If your front LB
is fair between the 6 servers, that could be related to a total number of
requests or connections or something like this.

Another relevant point is that these servers are tied together using upstream, GeoIP-based DNS load balancing. So the request rate across servers varies quite a bit depending on the location. This would make a synchronized failure based on total requests less likely.

I'm thinking about other things :
  - if you're doing a lot of SSL we could imagine an issue with random
    generation using /dev/random instead of /dev/urandom. I've met this
    issue a long time ago on some apache servers where all the entropy
    was progressively consumed until it was not possible anymore to get
    a connection.
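
One way to check the entropy theory is to watch the kernel's entropy estimate on each server. This is a minimal sketch, assuming a Linux host that exposes /proc/sys/kernel/random/entropy_avail. The function names and the threshold of 200 bits are illustrative assumptions, not values from HAProxy or from this thread.

```python
from pathlib import Path

# Linux exposes the kernel's entropy estimate (in bits) in this file.
ENTROPY_PATH = Path("/proc/sys/kernel/random/entropy_avail")


def read_entropy(path=ENTROPY_PATH):
    """Return the kernel's available-entropy estimate as an integer."""
    return int(Path(path).read_text().strip())


def is_entropy_low(bits, threshold=200):
    """Flag a reading below the threshold.

    The default threshold is an arbitrary illustrative value (assumption).
    """
    return bits < threshold
```

Logging `read_entropy()` once a minute would show whether the pool drains progressively in the hours before a hang.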

I'll set up a script to capture the netstat and other info prior to reloading should this issue re-occur.
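
A capture script along those lines could look roughly like the sketch below. The command list is an assumption about what "other info" might include, and `capture_state` and the output directory are hypothetical names. Each command's output is saved to a timestamped directory, so that a reload does not destroy the evidence.

```python
import subprocess
import time
from pathlib import Path

# Commands to snapshot before a reload. This list is an assumption,
# not something specified in the thread; adjust to taste.
DEFAULT_COMMANDS = {
    "netstat": ["netstat", "-an"],
    "ss_summary": ["ss", "-s"],
    "entropy": ["cat", "/proc/sys/kernel/random/entropy_avail"],
}


def capture_state(out_dir, commands=DEFAULT_COMMANDS, timeout=30):
    """Run each command and save its output under a timestamped directory."""
    snapshot = Path(out_dir) / time.strftime("%Y%m%d-%H%M%S")
    snapshot.mkdir(parents=True, exist_ok=True)
    for name, argv in commands.items():
        try:
            result = subprocess.run(
                argv, capture_output=True, text=True, timeout=timeout
            )
            text = result.stdout + result.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # Record the failure instead of aborting the whole capture.
            text = f"failed to run {argv}: {exc}\n"
        (snapshot / f"{name}.txt").write_text(text)
    return snapshot
```

Calling `capture_state("/var/tmp/haproxy-debug")` (a hypothetical path) just before the reload leaves one file per command to compare across the six servers.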

As for SSL, yes, we do a fair bit of it (about 30% of total request count). HAProxy does the TLS termination and then hands off via TCP proxy.

Best,
-=Mark S.
