----- Original Message -----
From: "Ariz Jacinto" <[EMAIL PROTECTED]>
To: "Philippine Linux Users Group Mailing List" <[EMAIL PROTECTED]>
Sent: Thursday, January 15, 2004 11:41 PM
Subject: Re: [plug] OpenMosix HPC + HA Cluster (heartbeat + DRBD)

> here's a link from the searchable PLUG archive regarding
> a simple Linux HA howto.
>
> http://marc.free.net.ph/message/20030211.111335.d4ec7690.html
>
> - can you give me more info regarding your problem?
> does the problem also occur when the master is
> properly shut down (or the heartbeat service is stopped)?
>
> - try to specify the UDP port explicitly
>
> udpport 694
>

I'll try this one...

> - try to minimize the HA timing settings such
> as the following:
>
> keepalive 1
> deadtime 3
> initdead 6
>
> i've experienced this kind of problem before
> and the culprit was the NIC driver. so i
> simply recompiled the driver, then everything
> worked fine.
>

NIC driver problem? hmm... the NIC brand was BMC.

> btw, are you only using a single NIC on each
> machine? coz you can't effectively use
> heartbeat + DRBD with that kind of setup.
> DRBD will get confused about which node should take
> ownership of the block device and will end
> up in "standalone" mode, which will
> then lead to a time-consuming "Full-Sync"
> every time the master rejoins the failover
> cluster.
>

I'm using 2 NICs. For now I'm not trying to implement DRBD. I want to make
my heartbeat work first, then I'll try to add services such as samba and do
file server failover testing.

> try this setup: use ETH0 for your typical
> LAN connection as well as your heartbeat
> medium, and use ETH1 as your dedicated disk
> replication (and OpenMosix Cluster) network.
> this setup will not only eliminate the "split-brain"
> on the block device but can also decongest your
> LAN traffic and increase the replication throughput.
> but you also have to add some lines to your HA configuration,
> such as specifying the RESPAWN IPFAIL & PING nodes, to effectively
> determine which of the two nodes is experiencing a network
> failure, in order to eliminate the HA state wherein both
> nodes acquire the Virtual IP and other resources.
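
(For reference, the ipfail / ping node lines mentioned above could look
roughly like the ha.cf sketch below. The ping address 192.168.1.1 and the
ipfail path are assumptions, not something given in this thread, and the
exact directives may differ between heartbeat versions.)

```
# ha.cf sketch: timing values and udpport are the ones suggested above
logfile   /var/log/ha-log
keepalive 1          # seconds between heartbeats
deadtime  3          # seconds of silence before a node is declared dead
initdead  6          # longer grace period right after startup
udpport   694
bcast     eth0       # heartbeat over the LAN interface
node      svr01
node      svr02

# ASSUMPTION: 192.168.1.1 stands in for a host both nodes can normally
# reach (for example the LAN gateway); replace with a real address.
ping      192.168.1.1

# ASSUMPTION: ipfail location varies by distribution and architecture.
respawn   hacluster /usr/lib/heartbeat/ipfail
```

The idea is that a node which can no longer reach the ping node is the one
with the network problem, so ipfail can have it give up its resources
instead of both nodes holding the virtual IP.
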
> much better if you'll use a Non-IP heartbeat medium such as
> a serial cable. if PING is blocked on your network then
> you can use other techniques for checking the network connection
> such as Nagios' check commands (check_ftp, check_tcp, etc.)
> or SNMP, then simply call heartbeat's tool "hb_standby" to
> make the master node release its resources.

When I define ETH0 as my LAN connection and ETH1 as my dedicated heartbeat
link, failover doesn't take place: nothing happens if I stop the heartbeat
service on my primary server or if I yank its cable. When I use ETH0 instead
as both my LAN and heartbeat connection, failover works once I yank the
primary server's cable or stop the heartbeat service. My problem is that
when I reconnect the cable of my primary server, the primary server takes
over the virtual IP but my secondary server does not release it. Both
servers have the virtual IP. I'll try the configurations you advise and hope
everything works fine - thanks...

>
> and one more thing, if you're going to use Samba for the failover
> cluster, you might as well use WINBIND to unify the user accounts
> between the 2 nodes as well as on the domain, to ease account
> management. care to add openmosix on your network? since
> samba uses a separate process for every client connection,
> that process "might be" migratable to other openmosix-enabled
> nodes such as your backup server, instead of wasting its computing
> capability, which is idle most of the time.
>
> HTH
>
> AT wrote:
>
> > Care to share your howto / links that you use in setting up HA-Cluster using
> > heartbeat + DRBD?
> >
> > I'm trying to set up an HA cluster using heartbeat on RH AS 2.1
> >
> > The heartbeat service is running on both the primary and backup machines.
> > When I intentionally remove the connection of the primary machine (pull the
> > network cable), the virtual ip address and the hostname are taken over by
> > the backup machine, which is as expected. Then I connect the primary machine
> > to the network and the virtual ip address and hostname are taken back by the
> > primary machine, but the problem is that the backup machine does not release
> > the virtual ip. So both machines have the same virtual ip address.
> >
> > Anything that I missed? or any suggestions on what went wrong?
> >
> > ---------------------------------
> > *ha.cf
> > logfile /var/log/ha-log
> > keepalive 5
> > deadtime 10
> > bcast eth0
> > node svr01
> > node svr02
> > ---------------------------------
> > *haresources
> > svr01 192.168.1.3 smb
> > ---------------------------------
> > where:
> > svr01 - primary machine
> > svr02 - backup machine
> > 192.168.1.3 - virtual ip
> > eth0 - heartbeat
> > smb - service to stop / start
> >
> > thanks!
> >
>
> --
> Philippine Linux Users' Group (PLUG) Mailing List
> [EMAIL PROTECTED] (#PLUG @ irc.free.net.ph)
> Official Website: http://plug.linux.org.ph
> Searchable Archives: http://marc.free.net.ph
> .
> To leave, go to http://lists.q-linux.com/mailman/listinfo/plug
> .
> Are you a Linux newbie? To join the newbie list, go to
> http://lists.q-linux.com/mailman/listinfo/ph-linux-newbie
