> So, heartbeat restart fixes the problem? That's strange. Apart from the
> heartbeating, Heartbeat is just an init system. Once it starts resources
> (whatever you put in haresources), the resources are on their own. You can
> even test your setup by shutting down Heartbeat everywhere and setting the
> alias IP address by hand.
> That shouldn't make any difference whatsoever to that proxy thing.

So if I understand this correctly, all Heartbeat does is start and stop
whatever is in haresources? If that's the case, and all I have in it is the
one IP address, then this is starting to sound like a Xen virtual node problem.
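
One way to test that the next time the stats page drops is to check whether
the shared address is still configured on each load balancer. The following
is only a rough sketch in Python, not anything shipped with Heartbeat: the
function names are made up for illustration, the sample text is illustrative
rather than captured from these servers, and it assumes the address list
comes from `ip -4 -o addr show` or from `ifconfig`.

```python
import re
import subprocess

# Matches "inet 192.168.31.100/24" (ip addr output) as well as
# "inet addr:192.168.31.100" (older ifconfig output). Broadcast
# addresses and inet6 lines are deliberately not matched.
INET_RE = re.compile(r"\binet\s+(?:addr:)?(\d{1,3}(?:\.\d{1,3}){3})")


def inet_addresses(output):
    """Return the set of IPv4 addresses configured according to `output`."""
    return set(INET_RE.findall(output))


def holds_address(output, address):
    """True if `address` is one of the configured IPv4 addresses."""
    return address in inet_addresses(output)


def read_local_addresses():
    """Read the local address list (assumes the iproute2 `ip` tool exists)."""
    result = subprocess.run(
        ["ip", "-4", "-o", "addr", "show"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Illustrative sample only: what LB1 might report while it holds the alias.
SAMPLE = """\
1: lo    inet 127.0.0.1/8 scope host lo
2: eth0    inet 192.168.31.201/24 brd 192.168.31.255 scope global eth0
2: eth0    inet 192.168.31.100/24 brd 192.168.31.255 scope global secondary eth0:0
"""
```

Calling holds_address(read_local_addresses(), "192.168.31.100") on LB1 and
LB2 once the stats page fails would show whether the address is still on the
node where haresources puts it.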


> > 
> > Thanks for getting back to me. I am using haresources which looks 
> > like this for LB1 & LB2:
> >    lb1.tlthost.net 192.168.31.100
> 
> There's only the IP address resource.
> 
> > I think I'm using 2.1.3-2. It's the version for Ubuntu 8.04 that's 
> > listed. I can't seem to find how to check it on my server.
> > Here are the two ha.cf files:
> > ***************** lb1 *********************
> > debugfile /var/log/ha-debug
> > logfile /var/log/ha-log
> > logfacility     local0
> > keepalive 2
> > deadtime 10
> > udpport 694
> > bcast  eth0
> > mcast  eth0 225.0.0.1 694 1 0
> > ucast  eth0 192.168.31.211
> > auto_failback on
> > udp     eth0
> > node    lb1.tlthost.net
> > node    lb2.tlthost.net
> > respawn hacluster /usr/lib/heartbeat/ipfail
> > apiauth ipfail gid=haclient uid=hacluster
> > 
> > ***************** lb2 *********************
> > debugfile /var/log/ha-debug
> > logfile /var/log/ha-log
> > logfacility     local0
> > keepalive 2
> > deadtime 10
> > udpport 694
> > bcast  eth0
> > mcast  eth0 225.0.0.1 694 1 0
> > ucast  eth0 192.168.31.201
> > auto_failback on
> > udp     eth0
> > node    lb1.tlthost.net
> > node    lb2.tlthost.net
> > respawn hacluster /usr/lib/heartbeat/ipfail
> > apiauth ipfail gid=haclient uid=hacluster
> > --------------------------------------------
> > I don't have Apache running on the node that Heartbeat/HAProxy is
> > on, but I did check the syslog for anything out of place, and I
> > couldn't find anything. The Apache log on the web server node
> > shouldn't see anything about HAProxy or Heartbeat at all, which is
> > how I know something is wrong when I see the Apache log error
> > "File does not exist: /var/www/apache2-default/haproxy". Since I use
> > http://192.168.31.100/haproxy?stats to access the stats page, the
> > request shouldn't reach the web server. It's like whatever happens
> > makes LB1 disappear. The funny thing is that everything else still
> > works as designed, even while this problem is happening. The only way
> > to make the stats work again is to stop and start Heartbeat. I tried
> > doing the same for HAProxy, but it did nothing. I didn't add any logs
> > here, because they are kind of big, even if I only include the part
> > from when it starts until it fails. If you do want to see some, please
> > let me know which ones, and can I attach them instead of pasting them?
> 
> No idea how this HAProxy thing works, sorry. At any rate, it's not
> under the control of Heartbeat. If the IP address (the only resource)
> is running where it should run (try ping and ifconfig), then you'll
> have to talk to the other guys again.
> 
> Thanks,
> 
> Dejan
> 
> > Thanks, Tom
> > 
> > 
> > -----Original Message-----
> > From: Dejan Muhamedagic [mailto:[email protected]]
> > Sent: Monday, May 25, 2009 12:21 PM
> > To: [email protected]; General Linux-HA mailing list
> > Subject: Re: [Linux-HA] New HA user keeps loosing connection
> > 
> > Hi,
> > 
> > On Sat, May 23, 2009 at 01:13:53PM -0400, Tom Potwin wrote:
> > > Hi
> > > 
> > > I hope I'm doing this correctly. I just joined this list after I 
> > > tried looking for help with the HAProxy people.
> > > 
> > > I'm using HAProxy and Heartbeat on two Ubuntu 8.04 servers. I have
> > > two Xen nodes on each of my physical machines. One is the load
> > > balancer running Heartbeat (LB1), the other is the actual LAMP web
> > > server (WEB1).
> > > Testing the HAProxy/Heartbeat setup suggests it's working fine: when
> > > I shut off one of the web servers, it switches to the other one. My
> > > problem is that I keep losing access to the HAProxy stats page. I
> > > know that isn't a huge problem, but I'm worried it might be a sign
> > > of a bigger problem somewhere.
> > >
> > > The stats show up fine for about 15-20 minutes, then I get a
> > > generic Apache 404 error page. I also see "File does not exist:
> > > /var/www/apache2-default/haproxy" show up in the Apache error log
> > > as soon as I lose it. If I go back to my LB1 node and restart
> > > Heartbeat, it all comes back for another 15-20 minutes. There's
> > > nothing in any of the logs that I can see, other than that logging
> > > stops when it happens.
> > > I use http://192.168.31.100/haproxy?stats to get to that stats page.
> > > The .100 is the shared address between the load balancers. If
> > > I use 192.168.31.201, which is LB1, I get the browser's 404
> > > notice. If I use .100, it shows my generic Apache 404 page. So
> > > somehow it stops seeing LB1, and goes to port 80 on my web server on
> > > the WEB1 node. That's where I see the Apache error saying it can't
> > > find the HAProxy stats page.
> > > 
> > > When I used the "tcpdump -q -i eth0 tcp port 80 and src host
> > > 192.168.31.100" command, it showed me looking at the stats, and the
> > > test web page:
> > > tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> > > listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
> > > 11:23:16.106664 IP 192.168.31.100.www > 192.168.30.64.2289: tcp 0
> > > 11:23:16.254209 IP 192.168.31.100.www > 192.168.30.64.2289: tcp 0
> > > 11:23:16.254409 IP 192.168.31.100.www > 192.168.30.64.2289: tcp 262
> > > 11:23:16.254501 IP 192.168.31.100.www > 192.168.30.64.2289: tcp 0
> > > 11:23:17.460534 IP 192.168.31.100.www > 192.168.30.64.2290: tcp 0
> > > 11:23:17.628385 IP 192.168.31.100.www > 192.168.30.64.2290: tcp 0
> > > 11:23:17.628590 IP 192.168.31.100.www > 192.168.30.64.2290: tcp 2712
> > > 11:23:17.839448 IP 192.168.31.100.www > 192.168.30.64.2290: tcp 2712
> > > 11:23:17.839460 IP 192.168.31.100.www > 192.168.30.64.2290: tcp 524
> > > 
> > > Once I couldn't see the stats page again, the output stopped
> > > completely. I watched it on LB2 as well. It seems like it stops
> > > listening on the .100 IP address. If I use "tcpdump -q -i eth0 tcp
> > > port 80" I see LB1 checking web1 and web2, but nothing on the .100
> > > address.
> > > The HAProxy people said they thought it might be a Heartbeat
> > > problem, because after they checked my HAProxy setup, they
> > > couldn't find any problems there. Sorry for the long post, I'm
> > > just getting desperate for some help.
> > 
> > OK. I doubt that this is a Heartbeat problem, because those typically
> > show up immediately rather than waiting 15 minutes to do so.
> > Anyway, I can't say more unless you provide the configuration and
> > logs. Which Heartbeat version do you use? What kind of configuration
> > (haresources or v2/CRM)?
> > 
> > BTW, did you check the Apache logs, i.e. is that file (a CGI script,
> > I guess) really missing or is there something else? Are all the
> > processes which are supposed to be running there?
> > 
> > Thanks,
> > 
> > Dejan
> > 
> > > Thanks, Tom
> > > 
> > > _______________________________________________
> > > Linux-HA mailing list
> > > [email protected]
> > > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > > See also: http://linux-ha.org/ReportingProblems
> > 