Hi Willy,
I'm following up on a post I made a few weeks ago. I'm still having a
problem with certain HAproxy backends randomly (about once every two
days) refusing to work with their assigned backend server. The only way
to bring them back to life is to restart HAproxy. At the same time,
other proxies (to other backend servers) are unaffected and continue to
work.
On 2014-01-28 18:31, Willy Tarreau wrote:
> The output of "show info" and "show sess" issued to the stats socket
> could be helpful. We'd see there if some connections remain there
> forever, etc... Be careful, this can be long and reveal internal
> information. If that's an issue, you can send them to me off-list.
> There's a more detailed "show sess all" which provides a more
> complete dump of the session table with flags, states, etc.
Dang, the error just occurred but I forgot to issue the show sess
command. But I still made some interesting observations this time.
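So that I don't miss it next time: as far as I understand, the stats
socket has to be enabled in the global section before these commands can
be issued. A minimal sketch, where the socket path, the mode and the
timeout are assumptions of mine and not taken from my running config:

```
global
    # Assumed path; any location writable by HAproxy should do
    stats socket /var/run/haproxy.sock mode 600 level admin
    # Give an interactive session some time before it is closed
    stats timeout 30s
```

With that in place, the dump could presumably be captured before the
restart with something like
echo "show sess all" | socat stdio /var/run/haproxy.sock
(again assuming that socket path).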
In the configuration, I added the check keyword to the server. I'm using
the default check settings set by HAproxy. I tcpdumped the traffic to
the backend server after the problem showed up again today and there was
not a single bit of traffic to the specified backend server for as long
as HAproxy showed a L4TOUT. Once I restarted HAproxy, the L4 check ping
occurred every two seconds and as usual, the problem vanished thanks to
the restart. At the time when I discovered the problem (10 minutes after
it occurred according to the stats) there was not a single connection to
the specified backend server showing up with netstat.
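For reference, this is what I believe the default check settings amount
to when only the check keyword is given, written out explicitly. The
values (2 second interval, down after 3 failed checks, up after 2
successful ones) are my reading of the documentation and match the
two-second interval I saw in the tcpdump after the restart:

```
# Same as a bare "check", assuming the documented defaults
server s.server1 sugardaddy1.com:80 check inter 2s fall 3 rise 2
```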
I actually had two different proxies to the same backend server, one for
port 80 and one for 443. Both went "red" at the same time.
And another observation in the log file: once HAproxy got the L4TOUT it
started passing the requests to other proxies randomly, ignoring the
use-server keyword specified for this proxy.
For example, if this is my backend config:
use-server s.server1 if { hdr(host) -i sugardaddy1.com }
server s.server1 sugardaddy1.com:80 check
use-server s.server2 if { hdr(host) -i sugardaddy2.com }
server s.server2 sugardaddy2.com:80 check
use-server s.server3 if { hdr(host) -i sugardaddy3.com }
server s.server3 sugardaddy3.com:80 check
Once there is a timeout (although it doesn't seem to happen every time!)
all requests for sugardaddy1.com go to sugardaddy2.com or
sugardaddy3.com instead, according to HAproxy's log output.
Could it just be a configuration error after all? Am I not supposed to
use the hdr keyword this way in the backend?
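If it turns out that a use-server rule pointing at a server that is
marked down is simply skipped, so that normal load balancing across the
remaining servers takes over (that is how I read the documentation, but
I may be wrong), then one layout I could try is a dedicated backend per
host, selected in the frontend. This is only a sketch; the frontend and
backend names and the bind line are made up for illustration:

```
frontend fe_http
    mode http
    bind :80
    # Pick the backend by Host header instead of using use-server
    use_backend be_server1 if { hdr(host) -i sugardaddy1.com }
    use_backend be_server2 if { hdr(host) -i sugardaddy2.com }
    use_backend be_server3 if { hdr(host) -i sugardaddy3.com }

backend be_server1
    mode http
    server s.server1 sugardaddy1.com:80 check

backend be_server2
    mode http
    server s.server2 sugardaddy2.com:80 check

backend be_server3
    mode http
    server s.server3 sugardaddy3.com:80 check
```

With one server per backend, a failed check should result in an error
for that host only, since there is no other server in the same backend
for the requests to land on.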
Hey, and I just noticed that with the latest snapshot the L4 ping shows
the correct duration instead of the usual 2001 ms.
Roland