here's a link from the searchable PLUG archive to
a simple Linux HA howto:

http://marc.free.net.ph/message/20030211.111335.d4ec7690.html



- can you give me more info about your problem?
  does the problem also occur when the master is
  shut down properly (or the heartbeat service is stopped)?


- try specifying the udpport explicitly:


udpport 694


- try lowering the HA timing settings, for example: keepalive 1, deadtime 3, initdead 6
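
to make the two suggestions above concrete, here is a rough sketch of how your ha.cf might look with them applied. the logfile, bcast and node lines are copied from your posted config. the timing values are only the starting points i mentioned, not tuned numbers, and the comments are my own reading of each directive. i put udpport ahead of bcast because, as far as i recall, that is the order used in the sample ha.cf.

```
# /etc/ha.d/ha.cf (sketch only, keep it the same on both nodes)
logfile /var/log/ha-log

# seconds between heartbeat packets
keepalive 1
# seconds of silence before the peer is declared dead
deadtime 3
# longer deadtime used right after startup, while the network comes up
initdead 6

# UDP port used for the bcast heartbeat
udpport 694
bcast eth0

node svr01
node svr02
```

restart heartbeat on both machines after changing the file, then repeat your cable-pull test and watch /var/log/ha-log on each node.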


i've run into this kind of problem before and the culprit was the NIC driver. i simply recompiled the driver and everything worked fine.

btw, are you using only a single NIC on each
machine? because you can't effectively use
heartbeat + DRBD with that kind of setup.
since heartbeat and replication share the one
link, a single failure cuts both at once. DRBD
gets confused about which node should take
ownership of the block device and ends up in
"standalone" mode, which then leads to a
time-consuming full sync every time the master
rejoins the failover cluster.

try this setup: use eth0 for your regular
LAN connection as well as your heartbeat
medium, and use eth1 as a dedicated disk
replication (and openMosix cluster) network.
this setup will not only eliminate the "split-brain"
on the block device, it can also decongest your
LAN traffic and increase replication throughput.
you also have to add some lines to your HA configuration,
such as a "respawn" line for ipfail and one or more "ping" nodes,
so heartbeat can determine which of the two nodes is
experiencing a network failure. this avoids the HA state where
both nodes acquire the virtual IP and other resources.
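
as a rough sketch of those extra lines, the fragment below shows what i would add to ha.cf for the two-NIC layout. the ping address 192.168.1.1 is made up (use a host on your LAN that is always up, such as the default gateway), and the ipfail path is an assumption that depends on how heartbeat was packaged on your distro. check the ipfail notes for your heartbeat version too, since it may need other settings that i'm not showing here.

```
# additions to /etc/ha.d/ha.cf (sketch only, same on both nodes)

# heartbeat travels over the LAN interface
bcast eth0

# ping node: an always-up host on the LAN side (made-up address)
ping 192.168.1.1

# run ipfail under heartbeat so the node that loses the ping node
# gives up its resources (path is an assumption, adjust to your install)
respawn hacluster /usr/lib/heartbeat/ipfail

# the DRBD replication link over eth1 is set up in the DRBD
# configuration, not here
```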
it's even better if you use a non-IP heartbeat medium such as
a serial cable. if ping is blocked on your network, then
you can use another technique for checking network connectivity,
such as Nagios' check commands (check_ftp, check_tcp, etc.)
or SNMP, then simply call heartbeat's "hb_standby" tool to
make the master node release its resources.
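
the "check something other than ping, then call hb_standby" idea could look roughly like the Python sketch below. this is not a heartbeat or Nagios feature, just a small watchdog i'm making up for illustration: it tries a plain TCP connect (similar in spirit to check_tcp) and runs hb_standby after a few consecutive failures. the hb_standby path, the function names and the failure threshold are all assumptions.

```python
import socket
import subprocess
import time

# Assumed install path; it differs between distributions and packages.
HB_STANDBY = "/usr/lib/heartbeat/hb_standby"


def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def should_standby(results, max_failures=3):
    """Return True once the last max_failures checks in results all failed."""
    recent = results[-max_failures:]
    return len(recent) == max_failures and not any(recent)


def watch(host, port, interval=5.0, max_failures=3):
    """Poll host:port; after max_failures straight misses, run hb_standby."""
    results = []
    while True:
        results.append(tcp_reachable(host, port))
        results = results[-max_failures:]  # keep only the recent history
        if should_standby(results, max_failures):
            # Ask heartbeat on this node to hand its resources to the peer.
            subprocess.run([HB_STANDBY], check=True)
            return
        time.sleep(interval)
```

you would run something like watch("192.168.1.1", 22) on the active node, where the host and port are a hypothetical always-on service on the LAN side. requiring several failures in a row keeps one dropped connection from triggering a failover.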

and one more thing, if you're going to use Samba for the failover
cluster, you might as well use winbind to unify the user accounts
between the 2 nodes and the domain to ease account
management. care to add openMosix to your network? since
Samba uses a separate process for every client connection,
that process "might be" migratable to other openMosix-enabled
nodes such as your backup server, instead of wasting its computing
capacity, which sits idle most of the time.

HTH




AT wrote:


Care to share your howto / links that you use in setting up HA-Cluster using
heartbeat + DRBD?

I'm trying to set up an HA cluster using heartbeat on RH AS 2.1.

The heartbeat service is running on both the primary and the backup machine.
When I intentionally disconnect the primary machine (pull the network
cable), the virtual IP address and the hostname are taken over by the
backup machine, as expected. When I reconnect the primary machine to the
network, it takes back the virtual IP address and hostname, but the backup
machine does not release the virtual IP. So both machines end up with the
same virtual IP address.

Is there anything I missed? Any suggestions as to what went wrong?

--------------------------------- *ha.cf
logfile /var/log/ha-log
keepalive 5
deadtime 10
bcast eth0
node svr01
node svr02
--------------------------------- *haresources
svr01 192.168.1.3 smb
--------------------------------- where:
svr01 - primary machine
svr02 - backup machine
192.168.1.3 - virtual ip
eth0 - heartbeat
smb - service to stop / start




thanks!




--
Philippine Linux Users' Group (PLUG) Mailing List
[EMAIL PROTECTED] (#PLUG @ irc.free.net.ph)
Official Website: http://plug.linux.org.ph
Searchable Archives: http://marc.free.net.ph
.
To leave, go to http://lists.q-linux.com/mailman/listinfo/plug
.
Are you a Linux newbie? To join the newbie list, go to
http://lists.q-linux.com/mailman/listinfo/ph-linux-newbie
