Hello,

Sometimes my network fails. If the network goes down and comes back up
after a few seconds, the HA cluster does not recover. I do not want to
set a high deadtime. How can I fix this?

crm_mon shows:
Node: storage-2 (88d7ff6f-d400-40ef-a215-8cc7a6d29072): OFFLINE
Node: storage-1 (7bce6375-3a7f-4ea1-9586-5b6f4c027190): online

ha.cf:
keepalive 1
deadtime 10
warntime 3
initdead 15
....
respawn root /usr/lib/heartbeat/pingd -m 100 -d 5s
apiauth ping gid=root uid=root



Version: heartbeat-2_2.1.4-7~bpo50+1_all.deb


From the log:


heartbeat[11791]: 2009/09/06_16:40:38 info: Link storage-2:eth1 up.
heartbeat[11791]: 2009/09/06_16:40:38 info: Link 10.1.131.65:10.1.131.65 up.
heartbeat[11791]: 2009/09/06_16:40:38 WARN: Late heartbeat: Node 10.1.131.65: interval 97770 ms
heartbeat[11791]: 2009/09/06_16:40:38 info: Status update for node 10.1.131.65: status ping
pingd[11800]: 2009/09/06_16:40:38 notice: pingd_lstatus_callback: Status update: Ping node storage-2 now has status [up]
pingd[11800]: 2009/09/06_16:40:38 notice: pingd_nstatus_callback: Status update: Ping node storage-2 now has status [up]
pingd[11800]: 2009/09/06_16:40:38 notice: pingd_lstatus_callback: Status update: Ping node 10.1.131.65 now has status [up]
pingd[11800]: 2009/09/06_16:40:38 notice: pingd_nstatus_callback: Status update: Ping node 10.1.131.65 now has status [up]
pingd[11800]: 2009/09/06_16:40:38 info: send_update: 1 active ping nodes
pingd[11800]: 2009/09/06_16:40:38 notice: pingd_nstatus_callback: Status update: Ping node 10.1.131.65 now has status [ping]
pingd[11800]: 2009/09/06_16:40:38 info: send_update: 1 active ping nodes
heartbeat[11791]: 2009/09/06_16:40:38 CRIT: Cluster node storage-2 returning after partition.
heartbeat[11791]: 2009/09/06_16:40:38 info: For information on cluster partitions, See URL: http://linux-ha.org/SplitBrain
heartbeat[11791]: 2009/09/06_16:40:38 WARN: Deadtime value may be too small.
heartbeat[11791]: 2009/09/06_16:40:38 info: See FAQ for information on tuning deadtime.
heartbeat[11791]: 2009/09/06_16:40:38 info: URL: http://linux-ha.org/FAQ#heavy_load
heartbeat[11791]: 2009/09/06_16:40:38 WARN: Late heartbeat: Node storage-2: interval 97760 ms
heartbeat[11791]: 2009/09/06_16:40:38 info: Status update for node storage-2: status active
pingd[11800]: 2009/09/06_16:40:38 notice: pingd_nstatus_callback: Status update: Ping node storage-2 now has status [active]
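
To put the "Late heartbeat" warnings next to the deadtime from ha.cf, here
is a rough Python sketch. It is not part of heartbeat; the function name
and the regular expression are my own assumptions, based only on the log
lines quoted above. It pulls out the reported intervals and checks whether
each one is longer than the configured deadtime of 10 seconds.

```python
import re

# Assumed pattern, derived only from the "Late heartbeat" lines in this log.
LATE_RE = re.compile(r"Late heartbeat: Node\s+(\S+): interval (\d+) ms")

def late_heartbeats(log_text, deadtime_s):
    """Return (node, interval_seconds, exceeded_deadtime) per warning found."""
    # Mail clients hard-wrap long log lines, so collapse all whitespace
    # (including newlines) to single spaces before matching.
    flat = " ".join(log_text.split())
    results = []
    for node, ms in LATE_RE.findall(flat):
        seconds = int(ms) / 1000.0
        results.append((node, seconds, seconds > deadtime_s))
    return results

# Two warnings copied from the log above, still hard-wrapped.
sample = """heartbeat[11791]: 2009/09/06_16:40:38 WARN: Late heartbeat: Node
10.1.131.65: interval 97770 ms
heartbeat[11791]: 2009/09/06_16:40:38 WARN: Late heartbeat: Node
storage-2: interval 97760 ms"""

# deadtime_s=10 matches "deadtime 10" in the ha.cf shown earlier.
for node, seconds, exceeded in late_heartbeats(sample, deadtime_s=10):
    status = "exceeds deadtime" if exceeded else "within deadtime"
    print(node, seconds, "s:", status)
```

Both intervals in this log are about 97 seconds, so by the time the link
came back storage-2 had already been silent for much longer than the
10 second deadtime.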







_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
