Hello,

I have a problem using DRBD 8.0.5 and Heartbeat 1.2.5 in an Active/Passive setup. When a large amount of data has to be synced between the two nodes (for instance, when I restart the crashed primary node), the resync takes a long time. The problem is that, once the boot process has completed, heartbeat triggers a takeover after the initdead delay (which makes sense, because this is the "preferred" primary node and auto_failback is on) even though the DRBD device is still syncing, despite the 120-second delay (the initdead setting).

I don't understand why heartbeat doesn't wait for the initdead delay before taking over. Obviously, this behaviour causes the applications to crash because their data isn't available.
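
To make the "still syncing" condition concrete, here is a minimal sketch of how it could be detected from the contents of /proc/drbd. It is only an illustration, not something heartbeat or drbddisk does: the function names are hypothetical, the sample line is hand-written rather than captured output, and the set of connection-state names is an assumption based on the DRBD 8.x "cs:" field.

```python
import re

# Connection states in which the device is still resynchronising.
# Assumption: these are the "cs:" values DRBD 8.x reports during a resync.
SYNCING_STATES = {"SyncSource", "SyncTarget", "PausedSyncS", "PausedSyncT"}

# Hand-written, illustrative line in the style of /proc/drbd (not real output).
SAMPLE = " 0: cs:SyncTarget st:Secondary/Primary ds:Inconsistent/UpToDate\n"


def connection_states(proc_drbd_text):
    """Return a dict mapping each device minor number to its cs: value."""
    states = {}
    pattern = re.compile(r"^\s*(\d+):\s+cs:(\w+)", re.MULTILINE)
    for match in pattern.finditer(proc_drbd_text):
        states[int(match.group(1))] = match.group(2)
    return states


def is_syncing(proc_drbd_text, minor=0):
    """True if the given minor is listed and is in a resync state."""
    return connection_states(proc_drbd_text).get(minor) in SYNCING_STATES


if __name__ == "__main__":
    # On a real node one would read /proc/drbd instead of SAMPLE.
    print(is_syncing(SAMPLE))
```

A check of this kind would have to succeed before the preferred node takes the resources back; where to hook it into the startup sequence is exactly the open question.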


Here are my configuration files:


##ha.cf (on primary):
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility     local0
keepalive 1
deadtime 5
warntime 3
initdead 120
udpport 694
ucast eth0 192.168.0.1
auto_failback on
node    Primary
node    Secondary
ping 10.X.X.X
respawn hacluster /usr/lib/heartbeat/ipfail
deadping 90
realtime on
debug 1
##

##ha.cf (on secondary):
logfile /var/log/ha-log
logfacility     local0
keepalive 1
deadtime 5
warntime 3
initdead 120
udpport 694
ucast eth0 192.168.0.2
auto_failback on
node    Primary
node    Secondary
ping 10.X.X.X
respawn hacluster /usr/lib/heartbeat/ipfail
deadping 90
realtime on
debug 1
##


##haresources.d (on primary)
Primary 10.0.254.254 drbddisk::data Filesystem::/dev/drbd0::/data::ext3 MailTo::r...@localhost::Cluster monit-Primary
##

##haresources.d (on secondary)
Primary 10.0.254.254 drbddisk::data Filesystem::/dev/drbd0::/data::ext3 MailTo::r...@localhost::Cluster monit-Secondary
##

(monit-Primary and monit-Secondary are the same file; only the name differs.)


What did I miss?

Thanks,
Vianney