Hello again list,

 I have a little more info to add.

 I was able to start both load balancers in debug mode, and I found something 
interesting. On lb1 (the functioning node) I see activity in the debug output 
as I access the sites. But in the debug output of lb2 this is all I see:


[root@VIRTCENT02:~] #haproxy -d -f /etc/haproxy/haproxy.cfg
Available polling systems :
     sepoll : pref=400,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
      epoll : disabled,  test result OK
Total: 4 (3 usable), will use sepoll.
Using sepoll() as the polling mechanism.
00000001:www.accept(0004)=0006 from [192.168.1.34:46634]
00000001:www.clireq[0006:ffff]: GET /admin?stats;csv HTTP/1.1
00000001:www.clihdr[0006:ffff]: TE: deflate,gzip;q=0.3
00000001:www.clihdr[0006:ffff]: Connection: TE, close
00000001:www.clihdr[0006:ffff]: Host: 192.168.1.200
00000001:www.clihdr[0006:ffff]: User-Agent: check_haproxy.pl
00000001:www.srvcls[0006:ffff]
00000001:www.clicls[0006:ffff]
00000001:www.closed[0006:ffff]


What you see here is the Nagios server requesting the stats page in CSV form 
(/admin?stats;csv) to confirm that the load balancer is alive. The Nagios check 
succeeds and reports the site as alive, but the sites will not appear in any 
browser. 
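
One way to compare the two traces is to group the debug lines by session and 
note whether a backend reply (an app.srvrep line) ever follows the client 
request. The sketch below is only an illustration in Python, not part of 
HAProxy; the function name summarize_sessions and the regular expression are 
assumptions based on the line format visible in the output here.

```python
import re

# Assumed line shape, taken from the debug output in this message:
#   <8 hex digits>:<proxy>.<event>[ or (   e.g. 00000001:www.clireq[0006:ffff]: GET ...
LINE_RE = re.compile(r"^([0-9a-f]{8}):([^.\s]+)\.(\w+)[\[(]")


def summarize_sessions(lines):
    """Map each session id to its request line and whether a server replied."""
    sessions = {}
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            # Startup banner or a wrapped continuation of a long header.
            continue
        session_id, _proxy, event = match.groups()
        info = sessions.setdefault(
            session_id, {"request": None, "backend_replied": False}
        )
        if event == "clireq":
            # Text after "]: " is the HTTP request line sent by the client.
            info["request"] = line.partition("]: ")[2].strip()
        elif event == "srvrep":
            # A srvrep line is the status line of a server response.
            info["backend_replied"] = True
    return sessions
```

Session ids restart at 00000000 for every haproxy process, so each node's 
output should be passed in separately. Fed the lb2 output, this reports a single 
session, the /admin?stats;csv request, with no srvrep line after it.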

If I fire up lb1 the sites start to work and I see this in the debug logs:


[root@VIRTCENT01:~] #haproxy -f /etc/haproxy/haproxy.cfg -d
Available polling systems :
     sepoll : pref=400,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
      epoll : disabled,  test result OK
Total: 4 (3 usable), will use sepoll.
Using sepoll() as the polling mechanism.
00000000:www.accept(0004)=0006 from [71.187.226.165:1024]
00000000:www.clireq[0006:ffff]: GET /cake/ HTTP/1.1
00000000:www.clihdr[0006:ffff]: Host: stage.jokefire.com
00000000:www.clihdr[0006:ffff]: User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:7.0.1) Gecko/20100101 Firefox/7.0.1
00000000:www.clihdr[0006:ffff]: Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
00000000:www.clihdr[0006:ffff]: Accept-Language: en-us,en;q=0.5
00000000:www.clihdr[0006:ffff]: Accept-Encoding: gzip, deflate
00000000:www.clihdr[0006:ffff]: Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
00000000:www.clihdr[0006:ffff]: Connection: keep-alive
00000000:www.clihdr[0006:ffff]: Cookie: CAKEPHP=l8ug7fl47khnhvhjmcgtc3kcu2; SERVERID=B
00000000:www.clihdr[0006:ffff]: Cache-Control: max-age=0
00000000:app.srvrep[0006:0007]: HTTP/1.1 200 OK
00000000:app.srvhdr[0006:0007]: Date: Sat, 15 Oct 2011 18:06:20 GMT
00000000:app.srvhdr[0006:0007]: Server: Apache/2.2.20 (CentOS)
00000000:app.srvhdr[0006:0007]: X-Powered-By: PHP/5.3.6
00000000:app.srvhdr[0006:0007]: P3P: CP="NOI ADM DEV PSAi COM NAV OUR OTRo STP IND DEM"
00000000:app.srvhdr[0006:0007]: Content-Length: 4937
00000000:app.srvhdr[0006:0007]: Connection: close
00000000:app.srvhdr[0006:0007]: Content-Type: text/html; charset=UTF-8
00000000:app.srvcls[0006:0007]
00000000:app.clicls[0006:0007]
00000000:app.closed[0006:0007]
00000001:www.accept(0004)=0006 from [71.187.226.165:1025]
00000001:www.clireq[0006:ffff]: GET /cake/app/webroot/css/cake.generic.css HTTP/1.1
00000001:www.clihdr[0006:ffff]: Host: stage.jokefire.com
00000001:www.clihdr[0006:ffff]: User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:7.0.1) Gecko/20100101 Firefox/7.0.1
00000001:www.clihdr[0006:ffff]: Accept: text/css,*/*;q=0.1
00000001:www.clihdr[0006:ffff]: Accept-Language: en-us,en;q=0.5
00000001:www.clihdr[0006:ffff]: Accept-Encoding: gzip, deflate
00000001:www.clihdr[0006:ffff]: Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7


Thanks once again for any insight you may have to share!

Tim




----- Original Message -----
From: "Tim Dunphy" <[email protected]>
To: [email protected]
Sent: Saturday, October 15, 2011 12:07:54 PM
Subject: simple failover is failing

Hello List,

 

  I have a very simple HAProxy configuration that balances two web servers. 
Failover used to work in both directions, from node 1 to node 2 and from node 2 
to node 1, but now the only node that displays the web sites is node 1. If node 
1 is stopped and node 2 is the only load balancer running, going to the URLs 
that worked under node 1 displays "page not found".
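
To reproduce what the browser does against one load balancer at a time, a small 
probe can send the same request with an explicit Host header. This is a minimal 
sketch in Python using only the standard library; the function name probe is 
hypothetical, and the address passed in has to be whatever address the node 
under test actually listens on (the individual addresses of lb1 and lb2 are not 
given in this message).

```python
import http.client


def probe(address, port, host_header, path="/", timeout=5):
    """Send one GET with an explicit Host header and return the HTTP status."""
    conn = http.client.HTTPConnection(address, port, timeout=timeout)
    try:
        # Supplying Host ourselves keeps http.client from deriving it
        # from the address, so the request looks like the browser's.
        conn.request(
            "GET", path, headers={"Host": host_header, "Connection": "close"}
        )
        return conn.getresponse().status
    finally:
        conn.close()


# Example call using values that appear in this message:
# probe("192.168.1.200", 80, "stage.jokefire.com", "/cake/")
```

A connection error or timeout raises an exception instead of returning a 
status, which by itself separates "nothing answered" from "haproxy answered 
with an error page".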

 This is a little puzzling because the configurations on the two nodes are 
identical. The only differences between the two configuration files are the 
node and description entries. 
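
Since a stray difference is easy to miss by eye, the claim that the two files 
differ only in node and description can be checked mechanically. The following 
is a rough sketch in Python, assuming both files have already been read into 
strings; config_diff and normalize are made-up helper names, and skipping 
comments and collapsing whitespace are assumptions of this sketch.

```python
import difflib

# Keywords expected to differ between the two nodes.
IGNORED_KEYWORDS = ("node", "description")


def normalize(config_text):
    """Drop blanks, comments and ignored keywords; collapse whitespace."""
    lines = []
    for raw in config_text.splitlines():
        words = raw.split()
        if not words or words[0].startswith("#"):
            continue
        if words[0] in IGNORED_KEYWORDS:
            continue
        lines.append(" ".join(words))
    return lines


def config_diff(text_a, text_b):
    """Return '- ...' / '+ ...' lines for every remaining difference."""
    delta = difflib.ndiff(normalize(text_a), normalize(text_b))
    # ndiff also emits unchanged lines and '? ' hint lines; keep real changes.
    return [line for line in delta if line.startswith(("- ", "+ "))]
```

An empty list means the two files agree apart from the ignored entries. It 
could be called as config_diff(open(path_a).read(), open(path_b).read()) with 
a copy of /etc/haproxy/haproxy.cfg from each node.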

## lb1 haproxy config -- this load balancer works - it shows the sites

global    
      log 127.0.0.1   local0 
      log 127.0.0.1   local1 notice
      maxconn         384 
      user  haproxy
      group haproxy
      noepoll      
      daemon
      node lb1
      description jokefire lb 1 
      spread-checks 5
 
defaults
      log     global
      mode    http
      option  httplog
      option  httpchk
      option  httpclose
      option  forwardfor
      option  redispatch
      retries 3
      contimeout      50000
      clitimeout      5000000
      srvtimeout      5000000
      stats uri /admin?stats
      #stats auth bluethundr:secret 
      stats refresh 5s

frontend www 192.168.1.200:80
log  global
default_backend app


backend app
log global
balance roundrobin
stats enable
cookie SERVERID insert indirect
option httpchk HEAD /check.txt HTTP/1.0
server web1 web1.summitnjhome.com:80 cookie A check maxconn 128 
server web2 web2.summitnjhome.com:80 cookie B check maxconn 128


## lb2 haproxy config - this load balancer does not -- sites are page not found!

global    
      log 127.0.0.1   local0 
      log 127.0.0.1   local1 notice
      maxconn         384 
      user  haproxy
      group haproxy
      noepoll      
      daemon
      node lb2
      description jokefire lb 1 
      spread-checks 5
 
defaults
      log     global
      mode    http
      option  httplog
      option  httpchk
      option  httpclose
      option  forwardfor
      option  redispatch
      retries 3
      contimeout      50000
      clitimeout      5000000
      srvtimeout      5000000
      stats uri /admin?stats
      #stats auth bluethundr:secret 
      stats refresh 5s

frontend www 192.168.1.200:80
log  global
default_backend app


backend app
log global
balance roundrobin
stats enable
cookie SERVERID insert indirect
option httpchk HEAD /check.txt HTTP/1.0
server web1 web1.summitnjhome.com:80 cookie A check maxconn 128 
server web2 web2.summitnjhome.com:80 cookie B check maxconn 128


## machine info

haproxy-1.3.25-1
CentOS release 5.7 (Final)
i686


Heartbeat is being provided by keepalived but that appears to be functioning 
well. 


Well, this is a slightly embarrassing situation, but I greatly appreciate any 
help you may have to offer. 

Thanks in advance!
Tim
