Hello,
I have a problem using DRBD 8.0.5 and Heartbeat 1.2.5 in an Active/
Passive setup. When a large amount of data has to be synced between the 2
nodes (for instance when I restart the crashed primary node), the resync
takes a long time. The problem is that, once the boot process is
completed, heartbeat triggers a takeover after the initdead delay
(which makes sense because this is the "preferred" primary node and
auto_failback is on) even if the DRBD device is still syncing, despite
the 120 sec. delay (initdead setting).
I don't understand why heartbeat doesn't wait for the initdead delay
before taking over. Obviously, this behaviour leads to software
crashes because the data isn't available.
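To make "still syncing" concrete, here is a minimal sketch of how the
local sync state could be tested before any resource is started. It
assumes the DRBD 8.0 /proc/drbd layout, where each device line carries
cs: (connection state), st: (roles) and ds: (local/peer disk state).
The function name and the sample line are hypothetical and made up for
illustration; they are not taken from the setup described here.

```python
import re

# Hypothetical sample of one device line in the assumed DRBD 8.0
# /proc/drbd layout; the values are made up for illustration.
SAMPLE = " 0: cs:SyncTarget st:Secondary/Primary ds:Inconsistent/UpToDate C r---\n"

# Captures: minor number, connection state, roles, local disk state,
# peer disk state.
DEVICE_LINE = re.compile(
    r"^\s*(\d+):\s+cs:(\S+)\s+st:(\S+)\s+ds:([^/\s]+)/(\S+)"
)


def local_disks_up_to_date(proc_drbd_text):
    """Return True only if every device line reports a local UpToDate disk.

    Returns False when no device line is found, so that an unreadable or
    empty status is never mistaken for a healthy one.
    """
    local_states = []
    for line in proc_drbd_text.splitlines():
        match = DEVICE_LINE.match(line)
        if match:
            local_states.append(match.group(4))
    return bool(local_states) and all(s == "UpToDate" for s in local_states)


if __name__ == "__main__":
    # On a real node the text would come from open("/proc/drbd").read().
    print(local_disks_up_to_date(SAMPLE))
```

A check like this only reports the state; whether and how heartbeat can
be made to act on it is exactly the open question here.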
These are my configuration files:
##ha.cf (on primary):
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 1
deadtime 5
warntime 3
initdead 120
udpport 694
ucast eth0 192.168.0.1
auto_failback on
node Primary
node Secondary
ping 10.X.X.X
respawn hacluster /usr/lib/heartbeat/ipfail
deadping 90
realtime on
debug 1
##
##ha.cf (on secondary):
logfile /var/log/ha-log
logfacility local0
keepalive 1
deadtime 5
warntime 3
initdead 120
udpport 694
ucast eth0 192.168.0.2
auto_failback on
node Primary
node Secondary
ping 10.X.X.X
respawn hacluster /usr/lib/heartbeat/ipfail
deadping 90
realtime on
debug 1
##
##haresources.d (on primary)
Primary 10.0.254.254 drbddisk::data Filesystem::/dev/drbd0::/data::ext3 MailTo::r...@localhost::Cluster monit-Primary
##
##haresources.d (on secondary)
Primary 10.0.254.254 drbddisk::data Filesystem::/dev/drbd0::/data::ext3 MailTo::r...@localhost::Cluster monit-Secondary
##
(monit-Primary and monit-Secondary are the same file; only the name
differs.)
What did I miss?
Thanks,
Vianney