Hi all.

I've got a 2-node cluster set up using Heartbeat, CRM and Ldirectord.

Most of the time it works well, but every so often resources won't fail
over to the other node. When this happens I can't shut Heartbeat down
cleanly; I have to kill -9 the heartbeat master control process before I
can restart it. In fact, I often have to do this just to restart Heartbeat
at all.

The problem seems to occur when the server has been left alone for a few
hours. If I test failover over and over, it's fine.

When the problem occurs, this is what is logged on the node that is still up:

Jan 21 15:58:52 ogg-dvla-02 crmd: [3789]: notice: crmd_client_status_callback: 
Status update: Client ogg-dvla-01/crmd now has status [offline]
Jan 21 15:58:52 ogg-dvla-02 crmd: [3789]: info: mem_handle_event: Got an event 
OC_EV_MS_NOT_PRIMARY from ccm
Jan 21 15:58:52 ogg-dvla-02 crmd: [3789]: info: mem_handle_event: instance=61, 
nodes=2, new=1, lost=0, n_idx=0, new_idx=2, old_idx=4
Jan 21 15:58:52 ogg-dvla-02 crmd: [3789]: info: crmd_ccm_msg_callback: Quorum 
lost after event=NOT PRIMARY (id=61)
Jan 21 15:58:52 ogg-dvla-02 cib: [3785]: info: mem_handle_event: Got an event 
OC_EV_MS_NOT_PRIMARY from ccm
Jan 21 15:58:52 ogg-dvla-02 cib: [3785]: info: mem_handle_event: instance=61, 
nodes=2, new=1, lost=0, n_idx=0, new_idx=2, old_idx=4
Jan 21 15:58:52 ogg-dvla-02 cib: [3785]: info: apply_xml_diff: Digest 
mis-match: expected 3e6e45302914a4cebbbd69f9dacc7426, calculated 
88d796cb9934a600c01584cc89659117
Jan 21 15:58:52 ogg-dvla-02 cib: [3785]: info: cib_process_diff: Diff 0.61.28 
-> 0.61.29 not applied to 0.61.28: Failed application of a global update.  
Requesting full refresh.
Jan 21 15:58:52 ogg-dvla-02 cib: [3785]: info: cib_process_diff: Requesting 
re-sync from peer: Failed application of a global update.  Requesting full 
refresh.
Jan 21 15:58:52 ogg-dvla-02 cib: [3785]: WARN: do_cib_notify: cib_apply_diff of 
<diff > FAILED: Application of an update diff failed, requesting a full refresh
Jan 21 15:58:52 ogg-dvla-02 cib: [3785]: WARN: cib_process_request: 
cib_apply_diff operation failed: Application of an update diff failed, 
requesting a full refresh
Jan 21 15:58:52 ogg-dvla-02 cib: [3785]: WARN: cib_process_diff: Not applying 
diff 0.61.29 -> 0.61.30 (sync in progress)
Jan 21 15:58:52 ogg-dvla-02 cib: [3785]: WARN: do_cib_notify: cib_apply_diff of 
<diff > FAILED: Application of an update diff failed, requesting a full refresh
Jan 21 15:58:52 ogg-dvla-02 cib: [3785]: WARN: cib_process_request: 
cib_apply_diff operation failed: Application of an update diff failed, 
requesting a full refresh
Jan 21 15:58:52 ogg-dvla-02 cib: [3785]: WARN: cib_process_diff: Not applying 
diff 0.61.30 -> 0.61.31 (sync in progress)
Jan 21 15:58:52 ogg-dvla-02 cib: [3785]: WARN: do_cib_notify: cib_apply_diff of 
<diff > FAILED: Application of an update diff failed, requesting a full refresh
Jan 21 15:58:52 ogg-dvla-02 cib: [3785]: WARN: cib_process_request: 
cib_apply_diff operation failed: Application of an update diff failed, 
requesting a full refresh
Jan 21 15:58:52 ogg-dvla-02 cib: [3785]: WARN: cib_process_diff: Not applying 
diff 0.61.31 -> 0.61.32 (sync in progress)
Jan 21 15:58:52 ogg-dvla-02 cib: [3785]: WARN: do_cib_notify: cib_apply_diff of 
<diff > FAILED: Application of an update diff failed, requesting a full refresh
Jan 21 15:58:52 ogg-dvla-02 cib: [3785]: WARN: cib_process_request: 
cib_apply_diff operation failed: Application of an update diff failed, 
requesting a full refresh
Jan 21 15:58:52 ogg-dvla-02 cib: [3785]: WARN: cib_process_diff: Not applying 
diff 0.61.32 -> 0.61.33 (sync in progress)
Jan 21 15:58:52 ogg-dvla-02 cib: [3785]: WARN: do_cib_notify: cib_apply_diff of 
<diff > FAILED: Application of an update diff failed, requesting a full refresh
Jan 21 15:58:52 ogg-dvla-02 cib: [3785]: WARN: cib_process_request: 
cib_apply_diff operation failed: Application of an update diff failed, 
requesting a full refresh
Jan 21 15:58:52 ogg-dvla-02 cib: [3785]: info: cib_client_status_callback: 
Status update: Client ogg-dvla-01/cib now has status [leave]
Jan 21 15:58:57 ogg-dvla-02 ccm: [3784]: debug: quorum plugin: majority
Jan 21 15:58:57 ogg-dvla-02 ccm: [3784]: debug: cluster:linux-ha, 
member_count=1, member_quorum_votes=100
Jan 21 15:58:57 ogg-dvla-02 crmd: [3789]: info: mem_handle_event: Got an event 
OC_EV_MS_INVALID from ccm
Jan 21 15:58:57 ogg-dvla-02 cib: [3785]: info: mem_handle_event: Got an event 
OC_EV_MS_INVALID from ccm
Jan 21 15:58:57 ogg-dvla-02 ccm: [3784]: debug: total_node_count=2, 
total_quorum_votes=200
Jan 21 15:58:57 ogg-dvla-02 crmd: [3789]: info: mem_handle_event: no mbr_track 
info
Jan 21 15:58:57 ogg-dvla-02 cib: [3785]: info: mem_handle_event: no mbr_track 
info
Jan 21 15:58:57 ogg-dvla-02 ccm: [3784]: debug: quorum plugin: twonodes
Jan 21 15:58:57 ogg-dvla-02 cib: [3785]: info: mem_handle_event: Got an event 
OC_EV_MS_NEW_MEMBERSHIP from ccm
Jan 21 15:58:57 ogg-dvla-02 ccm: [3784]: debug: cluster:linux-ha, 
member_count=1, member_quorum_votes=100
Jan 21 15:58:57 ogg-dvla-02 cib: [3785]: info: mem_handle_event: instance=62, 
nodes=1, new=0, lost=1, n_idx=0, new_idx=1, old_idx=3
Jan 21 15:58:57 ogg-dvla-02 ccm: [3784]: debug: total_node_count=2, 
total_quorum_votes=200
Jan 21 15:58:57 ogg-dvla-02 crmd: [3789]: info: mem_handle_event: Got an event 
OC_EV_MS_NEW_MEMBERSHIP from ccm
Jan 21 15:58:57 ogg-dvla-02 cib: [3785]: info: cib_ccm_msg_callback: LOST: 
ogg-dvla-01
Jan 21 15:58:57 ogg-dvla-02 ccm: [3784]: info: Break tie for 2 nodes cluster
Jan 21 15:58:57 ogg-dvla-02 crmd: [3789]: info: mem_handle_event: instance=62, 
nodes=1, new=0, lost=1, n_idx=0, new_idx=1, old_idx=3
Jan 21 15:58:57 ogg-dvla-02 cib: [3785]: info: cib_ccm_msg_callback: PEER: 
ogg-dvla-02
Jan 21 15:58:57 ogg-dvla-02 crmd: [3789]: info: crmd_ccm_msg_callback: Quorum 
(re)attained after event=NEW MEMBERSHIP (id=62)
Jan 21 15:58:57 ogg-dvla-02 crmd: [3789]: info: ccm_event_detail: NEW 
MEMBERSHIP: trans=62, nodes=1, new=0, lost=1 n_idx=0, new_idx=1, old_idx=3
Jan 21 15:58:57 ogg-dvla-02 crmd: [3789]: info: ccm_event_detail:       
CURRENT: ogg-dvla-02 [nodeid=1, born=62]
Jan 21 15:58:57 ogg-dvla-02 crmd: [3789]: info: ccm_event_detail:       LOST:   
 ogg-dvla-01 [nodeid=0, born=1]
Jan 21 15:59:23 ogg-dvla-02 heartbeat: [3275]: WARN: node ogg-dvla-01: is dead
Jan 21 15:59:23 ogg-dvla-02 heartbeat: [3275]: info: Link ogg-dvla-01:eth0 dead.
Jan 21 15:59:23 ogg-dvla-02 crmd: [3789]: notice: crmd_ha_status_callback: 
Status update: Node ogg-dvla-01 now has status [dead]
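
For anyone who wants to pull the rejected CIB updates out of a trace like the one above, here is a rough Python sketch. It is not part of my setup: the function names `unwrap` and `cib_diff_failures` are made up for illustration, and it assumes every entry starts with the usual syslog timestamp and that the mail client wrapped long entries onto continuation lines.

```python
import re

# Start of a syslog entry, e.g. "Jan 21 15:58:52 " (assumed timestamp format).
ENTRY_START = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d\d:\d\d:\d\d ")

# Version pair in messages such as "Diff 0.61.28 -> 0.61.29 not applied".
DIFF_VERSIONS = re.compile(r"diff (\d+\.\d+\.\d+) -> (\d+\.\d+\.\d+)", re.I)


def unwrap(lines):
    """Rejoin log entries that were wrapped across several lines."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank or padding-only lines
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line  # continuation of the previous entry
    return entries


def cib_diff_failures(entries):
    """Return (from_version, to_version) for each diff the cib did not apply."""
    failures = []
    for entry in entries:
        if " cib: " not in entry or "not appl" not in entry.lower():
            continue
        match = DIFF_VERSIONS.search(entry)
        if match:
            failures.append((match.group(1), match.group(2)))
    return failures


# A few entries copied from the failing trace, still wrapped.
sample = """\
Jan 21 15:58:52 ogg-dvla-02 cib: [3785]: info: cib_process_diff: Diff 0.61.28
-> 0.61.29 not applied to 0.61.28: Failed application of a global update.
Requesting full refresh.
Jan 21 15:58:52 ogg-dvla-02 cib: [3785]: WARN: cib_process_diff: Not applying
diff 0.61.29 -> 0.61.30 (sync in progress)
Jan 21 15:59:23 ogg-dvla-02 heartbeat: [3275]: WARN: node ogg-dvla-01: is dead
""".splitlines()

entries = unwrap(sample)
failures = cib_diff_failures(entries)
for old, new in failures:
    print(f"diff {old} -> {new} was not applied")
```

Run against the full trace, this would list every update from 0.61.28 onwards that the surviving node refused while it was waiting for a re-sync from a peer that was already going away.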


And this is what it logs when it fails over OK:

Jan 21 16:29:31 ogg-dvla-02 cib: [3785]: info: cib_stats: Processed 39
operations (1794.00us average, 0% utilization) in the last 10min
Jan 21 16:32:04 ogg-dvla-02 crmd: [3789]: info: handle_shutdown_request:
Creating shutdown request for ogg-dvla-01
Jan 21 16:32:04 ogg-dvla-02 tengine: [7204]: info: extract_event: Aborting on
shutdown attribute for 74195e76-f72c-45a2-aba5-07a0574c4058
Jan 21 16:32:04 ogg-dvla-02 crmd: [3789]: info: do_state_transition: State
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_IPC_MESSAGE
origin=route_message ]
Jan 21 16:32:04 ogg-dvla-02 tengine: [7204]: info: update_abort_priority: Abort
priority upgraded to 1000000
Jan 21 16:32:04 ogg-dvla-02 crmd: [3789]: info: do_state_transition: All 2
cluster nodes are eligible to run resources.
Jan 21 16:32:04 ogg-dvla-02 pengine: [7205]: info: determine_online_status:
Node ogg-dvla-01 is shutting down
Jan 21 16:32:04 ogg-dvla-02 pengine: [7205]: info: determine_online_status:
Node ogg-dvla-02 is online
Jan 21 16:32:04 ogg-dvla-02 pengine: [7205]: notice: group_print: Resource
Group: load_balancer
Jan 21 16:32:04 ogg-dvla-02 pengine: [7205]: notice: native_print:     vip
(ocf::heartbeat:IPaddr2):       Started ogg-dvla-01
Jan 21 16:32:04 ogg-dvla-02 pengine: [7205]: notice: native_print:
ldirector        (ocf::heartbeat:ldirectord):    Started ogg-dvla-01
Jan 21 16:32:04 ogg-dvla-02 pengine: [7205]: notice: NoRoleChange: Move
resource vip   (ogg-dvla-01 -> ogg-dvla-02)
Jan 21 16:32:04 ogg-dvla-02 pengine: [7205]: notice: StopRsc:   ogg-dvla-01
Stop vip
Jan 21 16:32:04 ogg-dvla-02 pengine: [7205]: notice: StartRsc:  ogg-dvla-02
Start vip
Jan 21 16:32:04 ogg-dvla-02 pengine: [7205]: notice: RecurringOp: ogg-dvla-02
   vip_monitor_20000
Jan 21 16:32:04 ogg-dvla-02 pengine: [7205]: notice: NoRoleChange: Move
resource ldirector     (ogg-dvla-01 -> ogg-dvla-02)
Jan 21 16:32:04 ogg-dvla-02 pengine: [7205]: notice: StopRsc:   ogg-dvla-01
Stop ldirector
Jan 21 16:32:04 ogg-dvla-02 pengine: [7205]: notice: StartRsc:  ogg-dvla-02
Start ldirector
Jan 21 16:32:04 ogg-dvla-02 pengine: [7205]: notice: RecurringOp: ogg-dvla-02
   ldirector_monitor_20000
Jan 21 16:32:04 ogg-dvla-02 pengine: [7205]: info: stage6: Scheduling Node
ogg-dvla-01 for shutdown
Jan 21 16:32:04 ogg-dvla-02 crmd: [3789]: info: do_state_transition: State
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
cause=C_IPC_MESSAGE origin=route_message ]
Jan 21 16:32:04 ogg-dvla-02 tengine: [7204]: info: unpack_graph: Unpacked
transition 11: 12 actions in 12 synapses
Jan 21 16:32:04 ogg-dvla-02 tengine: [7204]: info: te_pseudo_action: Pseudo
action 15 fired and confirmed
Jan 21 16:32:04 ogg-dvla-02 tengine: [7204]: info: send_rsc_command: Initiating
action 10: ldirector_stop_0 on ogg-dvla-01
Jan 21 16:32:04 ogg-dvla-02 pengine: [7205]: info: process_pe_message:
Transition 11: PEngine Input stored in:
/var/lib/heartbeat/pengine/pe-input-1809.bz2
Jan 21 16:32:05 ogg-dvla-02 tengine: [7204]: info: match_graph_event: Action
ldirector_stop_0 (10) confirmed on ogg-dvla-01 (rc=0)
Jan 21 16:32:05 ogg-dvla-02 tengine: [7204]: info: send_rsc_command: Initiating
action 7: vip_stop_0 on ogg-dvla-01
Jan 21 16:32:06 ogg-dvla-02 tengine: [7204]: info: match_graph_event: Action
vip_stop_0 (7) confirmed on ogg-dvla-01 (rc=0)
Jan 21 16:32:06 ogg-dvla-02 tengine: [7204]: info: te_pseudo_action: Pseudo
action 16 fired and confirmed
Jan 21 16:32:06 ogg-dvla-02 tengine: [7204]: info: te_pseudo_action: Pseudo
action 3 fired and confirmed
Jan 21 16:32:06 ogg-dvla-02 tengine: [7204]: info: te_crm_command: Executing
crm-event (18): do_shutdown on ogg-dvla-01
Jan 21 16:32:06 ogg-dvla-02 crmd: [3789]: info: do_lrm_rsc_op: Performing
op=vip_start_0 key=8:11:4f1659a1-c670-4bfc-972a-341fd26c1aba)
Jan 21 16:32:06 ogg-dvla-02 tengine: [7204]: info: te_pseudo_action: Pseudo
action 13 fired and confirmed
Jan 21 16:32:06 ogg-dvla-02 lrmd: [3786]: info: rsc:vip: start
Jan 21 16:32:06 ogg-dvla-02 tengine: [7204]: info: send_rsc_command: Initiating
action 8: vip_start_0 on ogg-dvla-02
Jan 21 16:32:06 ogg-dvla-02 IPaddr2[18852]: [18887]: INFO: Removing conflicting
loopback lo.
Jan 21 16:32:06 ogg-dvla-02 IPaddr2[18852]: [18888]: INFO: ip -f inet addr
delete 10.167.30.76/32 dev lo
Jan 21 16:32:06 ogg-dvla-02 IPaddr2[18852]: [18890]: INFO: ip -o -f inet addr
show lo
Jan 21 16:32:06 ogg-dvla-02 IPaddr2[18852]: [18892]: INFO: ip route delete
10.167.30.76 dev lo
Jan 21 16:32:06 ogg-dvla-02 lrmd: [3786]: info: RA output: (vip:start:stderr)
RTNETLINK answers: No such process
Jan 21 16:32:06 ogg-dvla-02 IPaddr2[18852]: [18894]: INFO: ip -f inet addr add 
10.167.30.76/25 brd 10.167.30.127 dev eth0
Jan 21 16:32:06 ogg-dvla-02 IPaddr2[18852]: [18896]: INFO: ip link set eth0 up
Jan 21 16:32:06 ogg-dvla-02 IPaddr2[18852]: [18898]: INFO: 
/usr/lib/heartbeat/send_arp -i 200 -r 5 -p 
/var/run/heartbeat/rsctmp/send_arp/send_arp-10.167.30.76 eth0 10.167.30.76 auto 
not_used not_used
Jan 21 16:32:06 ogg-dvla-02 crmd: [3789]: info: process_lrm_event: LRM 
operation vip_start_0 (call=22, rc=0) complete
Jan 21 16:32:06 ogg-dvla-02 tengine: [7204]: info: match_graph_event: Action 
vip_start_0 (8) confirmed on ogg-dvla-02 (rc=0)
Jan 21 16:32:06 ogg-dvla-02 crmd: [3789]: info: do_lrm_rsc_op: Performing 
op=vip_monitor_20000 key=9:11:4f1659a1-c670-4bfc-972a-341fd26c1aba)
Jan 21 16:32:06 ogg-dvla-02 tengine: [7204]: info: send_rsc_command: Initiating 
action 9: vip_monitor_20000 on ogg-dvla-02
Jan 21 16:32:06 ogg-dvla-02 tengine: [7204]: info: send_rsc_command: Initiating 
action 11: ldirector_start_0 on ogg-dvla-02
Jan 21 16:32:06 ogg-dvla-02 crmd: [3789]: info: do_lrm_rsc_op: Performing 
op=ldirector_start_0 key=11:11:4f1659a1-c670-4bfc-972a-341fd26c1aba)
Jan 21 16:32:06 ogg-dvla-02 lrmd: [3786]: info: rsc:ldirector: start
Jan 21 16:32:06 ogg-dvla-02 crmd: [3789]: info: process_lrm_event: LRM 
operation vip_monitor_20000 (call=23, rc=0) complete
Jan 21 16:32:06 ogg-dvla-02 tengine: [7204]: info: match_graph_event: Action 
vip_monitor_20000 (9) confirmed on ogg-dvla-02 (rc=0)
Jan 21 16:32:07 ogg-dvla-02 crmd: [3789]: info: process_lrm_event: LRM 
operation ldirector_start_0 (call=24, rc=0) complete
Jan 21 16:32:07 ogg-dvla-02 tengine: [7204]: info: match_graph_event: Action 
ldirector_start_0 (11) confirmed on ogg-dvla-02 (rc=0)
Jan 21 16:32:07 ogg-dvla-02 tengine: [7204]: info: te_pseudo_action: Pseudo 
action 14 fired and confirmed
Jan 21 16:32:07 ogg-dvla-02 tengine: [7204]: info: send_rsc_command: Initiating 
action 12: ldirector_monitor_20000 on ogg-dvla-02
Jan 21 16:32:07 ogg-dvla-02 crmd: [3789]: info: do_lrm_rsc_op: Performing 
op=ldirector_monitor_20000 key=12:11:4f1659a1-c670-4bfc-972a-341fd26c1aba)
Jan 21 16:32:07 ogg-dvla-02 crmd: [3789]: notice: crmd_client_status_callback: 
Status update: Client ogg-dvla-01/crmd now has status [offline]
Jan 21 16:32:07 ogg-dvla-02 crmd: [3789]: info: process_lrm_event: LRM 
operation ldirector_monitor_20000 (call=25, rc=0) complete
Jan 21 16:32:07 ogg-dvla-02 tengine: [7204]: info: match_graph_event: Action 
ldirector_monitor_20000 (12) confirmed on ogg-dvla-02 (rc=0)
Jan 21 16:32:07 ogg-dvla-02 cib: [3785]: info: cib_process_shutdown_req: 
Shutdown REQ from ogg-dvla-01
Jan 21 16:32:08 ogg-dvla-02 cib: [3785]: info: sync_our_cib: Syncing CIB to 
ogg-dvla-01
Jan 21 16:32:08 ogg-dvla-02 ccm: [3784]: debug: quorum plugin: majority
Jan 21 16:32:08 ogg-dvla-02 crmd: [3789]: info: mem_handle_event: Got an event 
OC_EV_MS_INVALID from ccm
Jan 21 16:32:08 ogg-dvla-02 ccm: [3784]: debug: cluster:linux-ha, 
member_count=1, member_quorum_votes=100
Jan 21 16:32:08 ogg-dvla-02 crmd: [3789]: info: mem_handle_event: no mbr_track 
info
Jan 21 16:32:08 ogg-dvla-02 ccm: [3784]: debug: total_node_count=2, 
total_quorum_votes=200
Jan 21 16:32:08 ogg-dvla-02 crmd: [3789]: info: mem_handle_event: Got an event 
OC_EV_MS_NEW_MEMBERSHIP from ccm
Jan 21 16:32:08 ogg-dvla-02 ccm: [3784]: debug: quorum plugin: twonodes
Jan 21 16:32:08 ogg-dvla-02 crmd: [3789]: info: mem_handle_event: instance=70, 
nodes=1, new=0, lost=1, n_idx=0, new_idx=1, old_idx=3
Jan 21 16:32:08 ogg-dvla-02 ccm: [3784]: debug: cluster:linux-ha, 
member_count=1, member_quorum_votes=100
Jan 21 16:32:08 ogg-dvla-02 crmd: [3789]: info: crmd_ccm_msg_callback: Quorum 
(re)attained after event=NEW MEMBERSHIP (id=70)
Jan 21 16:32:08 ogg-dvla-02 ccm: [3784]: debug: total_node_count=2, 
total_quorum_votes=200
Jan 21 16:32:08 ogg-dvla-02 crmd: [3789]: info: ccm_event_detail: NEW 
MEMBERSHIP: trans=70, nodes=1, new=0, lost=1 n_idx=0, new_idx=1, old_idx=3
Jan 21 16:32:08 ogg-dvla-02 ccm: [3784]: info: Break tie for 2 nodes cluster
Jan 21 16:32:08 ogg-dvla-02 crmd: [3789]: info: ccm_event_detail:       
CURRENT: ogg-dvla-02 [nodeid=1, born=70]
Jan 21 16:32:08 ogg-dvla-02 crmd: [3789]: info: ccm_event_detail:       LOST:   
 ogg-dvla-01 [nodeid=0, born=69]
Jan 21 16:32:08 ogg-dvla-02 cib: [3785]: info: cib_client_status_callback: 
Status update: Client ogg-dvla-01/cib now has status [leave]
Jan 21 16:32:08 ogg-dvla-02 cib: [3785]: info: mem_handle_event: Got an event 
OC_EV_MS_INVALID from ccm
Jan 21 16:32:08 ogg-dvla-02 cib: [3785]: info: mem_handle_event: no mbr_track 
info
Jan 21 16:32:08 ogg-dvla-02 cib: [3785]: info: mem_handle_event: Got an event 
OC_EV_MS_NEW_MEMBERSHIP from ccm
Jan 21 16:32:08 ogg-dvla-02 cib: [3785]: info: mem_handle_event: instance=70, 
nodes=1, new=0, lost=1, n_idx=0, new_idx=1, old_idx=3
Jan 21 16:32:08 ogg-dvla-02 cib: [3785]: info: cib_ccm_msg_callback: LOST: 
ogg-dvla-01
Jan 21 16:32:08 ogg-dvla-02 cib: [3785]: info: cib_ccm_msg_callback: PEER: 
ogg-dvla-02
Jan 21 16:32:08 ogg-dvla-02 tengine: [7204]: info: run_graph: Transition 11: 
(Complete=12, Pending=0, Fired=0, Skipped=0, Incomplete=0)
Jan 21 16:32:08 ogg-dvla-02 tengine: [7204]: info: notify_crmd: Transition 11 
status: te_complete - <null>
Jan 21 16:32:08 ogg-dvla-02 crmd: [3789]: info: do_state_transition: State 
transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS 
cause=C_IPC_MESSAGE origin=route_message ]
Jan 21 16:32:23 ogg-dvla-02 sshd[19003]: Accepted publickey for root from 
192.168.176.253 port 51320 ssh2
Jan 21 16:32:39 ogg-dvla-02 heartbeat: [3275]: WARN: node ogg-dvla-01: is dead
Jan 21 16:32:39 ogg-dvla-02 crmd: [3789]: notice: crmd_ha_status_callback: 
Status update: Node ogg-dvla-01 now has status [dead]
Jan 21 16:32:39 ogg-dvla-02 heartbeat: [3275]: info: Link ogg-dvla-01:eth0 dead.
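
The main difference in the good trace is that the tengine actually schedules stop and start actions. A rough Python sketch for listing them in order is below, in case it helps with comparing traces. It is illustrative only: `initiated_actions` is a made-up name, and it assumes each log entry has already been joined onto a single line.

```python
import re

# Matches tengine messages like "Initiating action 10: ldirector_stop_0 on ogg-dvla-01".
ACTION = re.compile(r"Initiating action (\d+): (\S+) on (\S+)")


def initiated_actions(entries):
    """Return (action_id, operation, node) for each action the tengine started."""
    actions = []
    for entry in entries:
        match = ACTION.search(entry)
        if match:
            actions.append((int(match.group(1)), match.group(2), match.group(3)))
    return actions


# Entries taken from the successful failover, already joined onto one line each.
prefix = "ogg-dvla-02 tengine: [7204]: info: send_rsc_command: "
sample_entries = [
    "Jan 21 16:32:04 " + prefix + "Initiating action 10: ldirector_stop_0 on ogg-dvla-01",
    "Jan 21 16:32:05 " + prefix + "Initiating action 7: vip_stop_0 on ogg-dvla-01",
    "Jan 21 16:32:06 " + prefix + "Initiating action 8: vip_start_0 on ogg-dvla-02",
    "Jan 21 16:32:06 " + prefix + "Initiating action 11: ldirector_start_0 on ogg-dvla-02",
]

actions = initiated_actions(sample_entries)
for action_id, operation, node in actions:
    print(f"{action_id:>3}  {operation:<20} {node}")
```

On the failing trace the same function would return an empty list, since no "Initiating action" messages appear there at all.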


Perhaps it is because I'm rebooting through init rather than pulling out
the Ethernet or power cable, but I don't have physical access to the box.

Does anyone have any ideas about what's happening? Versions and config
are below. Thanks.

Suse Enterprise 10 SP2

heartbeat-2.1.3-0.9
heartbeat-pils-2.1.3-0.9
heartbeat-stonith-2.1.3-0.9
heartbeat-ldirectord-2.1.3-0.9
heartbeat-cmpi-2.1.3-0.9

/etc/ha.d/ha.cf:
crm on
udpport 694
ucast eth0 10.167.30.71
ucast eth0 10.167.30.73
node ogg-dvla-01
node ogg-dvla-02

/etc/ha.d/ldirectord.cf:
# /etc/ha.d/ldirectord.cf
checktimeout=3
checkinterval=5
autoreload=yes
logfile="/var/log/ldirectord.log"
quiescent=yes
virtual=10.167.30.76:80
        [email protected]
        fallback=127.0.0.1:80
        real=10.167.30.71:80 gate
        real=10.167.30.73:80 gate
        service=http
        request="test.html"
        receive="Still alive"
        scheduler=wlc
        protocol=tcp
        checktype=negotiate
virtual=10.167.30.76:3306
        [email protected]
        fallback=127.0.0.1:3306
        real=10.167.30.71:3306 gate
        real=10.167.30.73:3306 gate
        service=mysql
        login="root"
        passwd="password"
        database="ldirector"
        request="SELECT * from connectioncheck;"
        scheduler=wlc
        protocol=tcp
        checktype=negotiate


CIB resources:

 <resources>
   <group id="load_balancer">
     <meta_attributes id="load_balancer_meta_attrs">
       <attributes>
         <nvpair id="load_balancer_metaattr_target_role" name="target_role" 
value="started"/>
         <nvpair id="load_balancer_metaattr_ordered" name="ordered" 
value="true"/>
         <nvpair id="load_balancer_metaattr_collocated" name="collocated" 
value="true"/>
       </attributes>
     </meta_attributes>
     <primitive id="vip" class="ocf" type="IPaddr2" provider="heartbeat">
       <instance_attributes id="vip_instance_attrs">
         <attributes>
           <nvpair id="d87cd780-3a51-419d-ac47-0fb150ec155b" name="ip" 
value="10.167.30.76"/>
           <nvpair id="a390efaa-2b48-406b-b122-a60c2af7b809" name="lvs_support" 
value="true"/>
         </attributes>
       </instance_attributes>
       <operations>
         <op id="c1aa0a35-d440-4e06-806e-a2886d2cea0a" name="monitor" 
interval="20" timeout="10" start_delay="0" on_fail="restart"/>
       </operations>
     </primitive>
     <primitive id="ldirector" class="ocf" type="ldirectord" 
provider="heartbeat">
       <instance_attributes id="ldirector_instance_attrs">
         <attributes>
           <nvpair id="c89bec2c-82fa-4405-b672-bbc842ce108c" name="configfile" 
value="/etc/ha.d/ldirectord.cf"/>
         </attributes>
       </instance_attributes>
       <operations>
         <op id="d77c87dc-831c-4ff5-8bd5-ea269e7ed3f3" name="monitor" 
interval="20" timeout="10" start_delay="0" on_fail="restart"/>
       </operations>
     </primitive>
   </group>
 </resources>
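
To make the resource definitions easier to check at a glance, here is a rough Python sketch that summarises each primitive with its parameters and monitor settings. It is not something I run on the cluster: `summarise_resources` is a made-up name, and the embedded XML is a trimmed copy of the section above with the nvpair and op ids replaced by short placeholders.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <resources> section; nvpair and op ids are shortened
# placeholders, the other values match the posted CIB.
CIB_RESOURCES = """
<resources>
  <group id="load_balancer">
    <primitive id="vip" class="ocf" type="IPaddr2" provider="heartbeat">
      <instance_attributes id="vip_instance_attrs">
        <attributes>
          <nvpair id="vip_ip" name="ip" value="10.167.30.76"/>
          <nvpair id="vip_lvs" name="lvs_support" value="true"/>
        </attributes>
      </instance_attributes>
      <operations>
        <op id="vip_mon" name="monitor" interval="20" timeout="10"/>
      </operations>
    </primitive>
    <primitive id="ldirector" class="ocf" type="ldirectord" provider="heartbeat">
      <instance_attributes id="ldirector_instance_attrs">
        <attributes>
          <nvpair id="ld_cfg" name="configfile" value="/etc/ha.d/ldirectord.cf"/>
        </attributes>
      </instance_attributes>
      <operations>
        <op id="ld_mon" name="monitor" interval="20" timeout="10"/>
      </operations>
    </primitive>
  </group>
</resources>
"""


def summarise_resources(xml_text):
    """List each primitive, in document order, with its parameters and operations."""
    root = ET.fromstring(xml_text.strip())
    summary = []
    for prim in root.iter("primitive"):
        params = {
            nv.get("name"): nv.get("value")
            for nv in prim.findall("instance_attributes/attributes/nvpair")
        }
        ops = [
            (op.get("name"), int(op.get("interval")), int(op.get("timeout")))
            for op in prim.findall("operations/op")
        ]
        summary.append(
            {"id": prim.get("id"), "type": prim.get("type"), "params": params, "ops": ops}
        )
    return summary


resources = summarise_resources(CIB_RESOURCES)
for res in resources:
    print(res["id"], res["type"], res["params"], res["ops"])
```

Because the group is ordered, the document order of the primitives (vip, then ldirector) is also the start order seen in the successful failover log.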



-- 
Darren Mansell <[email protected]>
OpenGI Ltd
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
