Hi,

I use http://192.168.31.100/haproxy?stats to get to the stats page. The .100 is the shared address between the load balancers. If I use .201, which is LB1, I get the browser's 404 notice. If I use .100, it shows my generic Apache 404 page. So somehow it stops seeing LB1 and goes to port 80 on my web server on the web1 node. That's where I see the Apache error saying it can't find the HAProxy stats page.

I never use the domain name I gave the server. I don't have any DNS entries for LB1 or LB2 because they are only accessed locally, unless that's wrong. All I have on the LB nodes is Ubuntu Server 8.04, HAProxy, Heartbeat, and a few support programs.

I have never used tcpdump before, so I had to do some quick research first. I reset the node again to make things work. When I used the "tcpdump -q -i eth0 tcp port 80 and src host 192.168.31.100" command, it showed me looking at the stats and the test web page:

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
11:23:16.106664 IP 192.168.31.100.www > 192.168.30.64.2289: tcp 0
11:23:16.254209 IP 192.168.31.100.www > 192.168.30.64.2289: tcp 0
11:23:16.254409 IP 192.168.31.100.www > 192.168.30.64.2289: tcp 262
11:23:16.254501 IP 192.168.31.100.www > 192.168.30.64.2289: tcp 0
11:23:17.460534 IP 192.168.31.100.www > 192.168.30.64.2290: tcp 0
11:23:17.628385 IP 192.168.31.100.www > 192.168.30.64.2290: tcp 0
11:23:17.628590 IP 192.168.31.100.www > 192.168.30.64.2290: tcp 2712
11:23:17.839448 IP 192.168.31.100.www > 192.168.30.64.2290: tcp 2712
11:23:17.839460 IP 192.168.31.100.www > 192.168.30.64.2290: tcp 524
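For anyone who wants to compare a capture taken while the stats page works against one taken after it stops, here is a minimal Python sketch that tallies "tcpdump -q" lines per connection. It is only an illustration: the function name summarize and the SAMPLE constant are made up for this example, and the regular expression assumes the exact line layout shown in the capture above (timestamp, "IP", source, destination, "tcp", payload length).

```python
import re
from collections import defaultdict

# Matches one line of `tcpdump -q` output in the layout seen in this thread, e.g.
# 11:23:16.254409 IP 192.168.31.100.www > 192.168.30.64.2289: tcp 262
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d+) IP "
    r"(?P<src>\S+) > (?P<dst>\S+): tcp (?P<length>\d+)$"
)


def summarize(capture_text):
    """Return {(src, dst): [packet_count, payload_bytes, last_timestamp]}."""
    flows = defaultdict(lambda: [0, 0, None])
    for line in capture_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip the tcpdump banner and anything unparseable
        entry = flows[(match["src"], match["dst"])]
        entry[0] += 1                      # one more packet on this flow
        entry[1] += int(match["length"])   # TCP payload bytes reported by -q
        entry[2] = match["time"]           # timestamp of the latest packet
    return dict(flows)


# Hypothetical sample: the capture lines quoted in the message, banner included.
SAMPLE = """\
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
11:23:16.106664 IP 192.168.31.100.www > 192.168.30.64.2289: tcp 0
11:23:16.254209 IP 192.168.31.100.www > 192.168.30.64.2289: tcp 0
11:23:16.254409 IP 192.168.31.100.www > 192.168.30.64.2289: tcp 262
11:23:16.254501 IP 192.168.31.100.www > 192.168.30.64.2289: tcp 0
11:23:17.460534 IP 192.168.31.100.www > 192.168.30.64.2290: tcp 0
11:23:17.628385 IP 192.168.31.100.www > 192.168.30.64.2290: tcp 0
11:23:17.628590 IP 192.168.31.100.www > 192.168.30.64.2290: tcp 2712
11:23:17.839448 IP 192.168.31.100.www > 192.168.30.64.2290: tcp 2712
11:23:17.839460 IP 192.168.31.100.www > 192.168.30.64.2290: tcp 524
"""

if __name__ == "__main__":
    for (src, dst), (packets, payload, last_seen) in summarize(SAMPLE).items():
        print(f"{src} -> {dst}: {packets} packets, {payload} bytes, last at {last_seen}")
```

The last timestamp per flow is the useful part here: if replies from 192.168.31.100 stop appearing on LB1 while the browser still gets a page, the answer is coming from some other machine.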
Once I couldn't see the stats page again, the output stopped. I watched it on LB2 as well. It seems like it stops listening on the .100 IP address. If I use "tcpdump -q -i eth0 tcp port 80" I see LB1 checking web1 and web2, but nothing on the .100 address. If I were running a nameserver, I'd say that's the problem, but I'm not.

Tom

-----Original Message-----
From: Willy Tarreau [mailto:[email protected]]
Sent: Saturday, May 16, 2009 10:06 AM
To: Tom Potwin
Cc: [email protected]
Subject: Re: New HAProxy user keeps loosing connection

On Sat, May 16, 2009 at 09:44:53AM -0400, Tom Potwin wrote:
> Hi Willy
>
> I checked the cfg files for both HAProxy and heartbeat, and they're
> the same where they are supposed to be. You were right about the
> SYSLOGD="-r" setting. I didn't know I had to do that. I've attached a
> new copy of the haproxy.log file I started when I restarted both LB1
> and LB2 servers. It shows from the time it works at 08:54, till it
> stops at 09:14. The log is from LB1, and the same log on LB2 shows
> "May 16 09:14:58 lb2 -- MARK --" when I couldn't get back into the
> stats. I still can get to my test local web site even after the stats
> go away.

But there is no log after 9:11. Are you really sure that your browser is still going to the load-balancer's IP address? Do you go there with its IP address or domain name? Maybe you have multiple "A" records for the same host and your browser is rotating between them?

Could you also try connecting to LB1's own IP address instead of the shared IP address, so that at least we get a clue whether it's caused by heartbeat doing strange things or something else?

Right now I would say that some of your traffic does not even reach the load balancer :-/ You could even install tcpdump on LB1 and check for your incoming requests.

Willy

