Hi Ken,

I don't see any obvious errors in the logs; this block appears in them periodically:

Jun 20 11:52:19 [5613] zfs-serv3       crmd:     info: crm_timer_popped:        
PEngine Recheck Timer (I_PE_CALC) just popped (900000ms)
Jun 20 11:52:19 [5613] zfs-serv3       crmd:   notice: do_state_transition:     
State transition S_IDLE -> S_POLICY_ENGINE | input=I_PE_CALC 
cause=C_TIMER_POPPED origin=crm_timer_popped
Jun 20 11:52:19 [5613] zfs-serv3       crmd:     info: do_state_transition:     
Progressed to state S_POLICY_ENGINE after C_TIMER_POPPED
Jun 20 11:52:19 [5612] zfs-serv3    pengine:     info: process_pe_message:      
Input has not changed since last time, not saving to disk
Jun 20 11:52:19 [5612] zfs-serv3    pengine:     info: determine_online_status: 
Node zfs-serv3 is online
Jun 20 11:52:19 [5612] zfs-serv3    pengine:     info: determine_online_status: 
Node zfs-serv4 is online
Jun 20 11:52:19 [5612] zfs-serv3    pengine:     info: native_print:    
vm_storage      (ocf::heartbeat:ZFS):   Stopped
Jun 20 11:52:19 [5612] zfs-serv3    pengine:     info: native_print:    ha-ip   
(ocf::heartbeat:IPaddr2):       Stopped
Jun 20 11:52:19 [5612] zfs-serv3    pengine:     info: native_print:    
resIPMI-zfs4    (stonith:external/ipmi):        Started zfs-serv3
Jun 20 11:52:19 [5612] zfs-serv3    pengine:     info: native_print:    
resIPMI-zfs3    (stonith:external/ipmi):        Started zfs-serv4
Jun 20 11:52:19 [5612] zfs-serv3    pengine:     info: native_print:    
nfs-server      (systemd:nfs-server):   Stopped
Jun 20 11:52:19 [5612] zfs-serv3    pengine:     info: RecurringOp:      Start 
recurring monitor (5s) for vm_storage on zfs-serv3
Jun 20 11:52:19 [5612] zfs-serv3    pengine:     info: RecurringOp:      Start 
recurring monitor (10s) for ha-ip on zfs-serv3
Jun 20 11:52:19 [5612] zfs-serv3    pengine:     info: RecurringOp:      Start 
recurring monitor (60s) for nfs-server on zfs-serv3
Jun 20 11:52:19 [5612] zfs-serv3    pengine:   notice: LogActions:      Start   
vm_storage      (zfs-serv3)
Jun 20 11:52:19 [5612] zfs-serv3    pengine:   notice: LogActions:      Start   
ha-ip   (zfs-serv3)
Jun 20 11:52:19 [5612] zfs-serv3    pengine:     info: LogActions:      Leave   
resIPMI-zfs4    (Started zfs-serv3)
Jun 20 11:52:19 [5612] zfs-serv3    pengine:     info: LogActions:      Leave   
resIPMI-zfs3    (Started zfs-serv4)
Jun 20 11:52:19 [5612] zfs-serv3    pengine:   notice: LogActions:      Start   
nfs-server      (zfs-serv3)
Jun 20 11:52:19 [5612] zfs-serv3    pengine:   notice: process_pe_message:      
Calculated transition 80, saving inputs in 
/var/lib/pacemaker/pengine/pe-input-228.bz2
Jun 20 11:52:19 [5613] zfs-serv3       crmd:     info: do_state_transition:     
State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE | input=I_PE_SUCCESS 
cause=C_IPC_MESSAGE origin=handle_response
Jun 20 11:52:19 [5613] zfs-serv3       crmd:     info: do_te_invoke:    
Processing graph 80 (ref=pe_calc-dc-1529488339-113) derived from 
/var/lib/pacemaker/pengine/pe-input-228.bz2
Jun 20 11:52:19 [5613] zfs-serv3       crmd:  warning: run_graph:       
Transition 80 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=6, 
Source=/var/lib/pacemaker/pengine/pe-input-228.bz2): Terminated
Jun 20 11:52:19 [5613] zfs-serv3       crmd:  warning: te_graph_trigger:        
Transition failed: terminated
Jun 20 11:52:19 [5613] zfs-serv3       crmd:   notice: print_graph:     Graph 
80 with 6 actions: batch-limit=0 jobs, network-delay=60000ms
Jun 20 11:52:19 [5613] zfs-serv3       crmd:   notice: print_synapse:   [Action 
   5]: Pending rsc op vm_storage_monitor_5000             on zfs-serv3 
(priority: 0, waiting:  4)
Jun 20 11:52:19 [5613] zfs-serv3       crmd:   notice: print_synapse:   [Action 
   4]: Pending rsc op vm_storage_start_0                  on zfs-serv3 
(priority: 0, waiting:  12)
Jun 20 11:52:19 [5613] zfs-serv3       crmd:   notice: print_synapse:   [Action 
   7]: Pending rsc op ha-ip_monitor_10000                 on zfs-serv3 
(priority: 0, waiting:  6)
Jun 20 11:52:19 [5613] zfs-serv3       crmd:   notice: print_synapse:   [Action 
   6]: Pending rsc op ha-ip_start_0                       on zfs-serv3 
(priority: 0, waiting:  4 12)
Jun 20 11:52:19 [5613] zfs-serv3       crmd:   notice: print_synapse:   [Action 
  13]: Pending rsc op nfs-server_monitor_60000            on zfs-serv3 
(priority: 0, waiting:  12)
Jun 20 11:52:19 [5613] zfs-serv3       crmd:   notice: print_synapse:   [Action 
  12]: Pending rsc op nfs-server_start_0                  on zfs-serv3 
(priority: 0, waiting:  4)
Jun 20 11:52:19 [5613] zfs-serv3       crmd:     info: do_log:  Input 
I_TE_SUCCESS received in state S_TRANSITION_ENGINE from notify_crmd
Jun 20 11:52:19 [5613] zfs-serv3       crmd:   notice: do_state_transition:     
State transition S_TRANSITION_ENGINE -> S_IDLE | input=I_TE_SUCCESS 
cause=C_FSA_INTERNAL origin=notify_crmd
[The identical block repeats 15 minutes later at 12:07:19 as transition 81
(ref=pe_calc-dc-1529489239-114), again derived from and saved to
/var/lib/pacemaker/pengine/pe-input-228.bz2, with the same six pending
actions and the same "Transition 81 (... Incomplete=6 ...): Terminated" /
"Transition failed: terminated" warnings.]
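Since the policy engine keeps saving the same input file, one way to inspect why those six actions never fire is to replay it offline. This is only a sketch, assuming the Pacemaker CLI tools are installed on the node; the file path is the one from the log above, and the guard makes the snippet a harmless no-op elsewhere:

```shell
# Replay the saved transition input offline to see which actions the cluster
# computed and why each start/monitor stays pending (sketch; run on a node
# where Pacemaker and the pe-input file below actually exist).
PE_INPUT=/var/lib/pacemaker/pengine/pe-input-228.bz2
if command -v crm_simulate >/dev/null 2>&1 && [ -f "$PE_INPUT" ]; then
    # -S simulates execution of the transition, -x reads the saved input
    crm_simulate -S -x "$PE_INPUT"
else
    echo "run on a cluster node: crm_simulate -S -x $PE_INPUT"
fi
```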


> Then there should be messages from the crmd about "Initiating" the command 
> and obtaining its "Result".
After a restart of the cluster, I don't find anything more useful in the logs:

grep crmd /var/log/corosync/corosync.log | egrep 'Initi|Result'
Jun 20 12:16:53 [27559] zfs-serv3       crmd:   notice: te_rsc_command: 
Initiating monitor operation vm_storage_monitor_0 locally on zfs-serv3 | action 
2
Jun 20 12:16:53 [27559] zfs-serv3       crmd:   notice: te_rsc_command: 
Initiating monitor operation ha-ip_monitor_0 locally on zfs-serv3 | action 3
Jun 20 12:16:53 [27559] zfs-serv3       crmd:   notice: te_rsc_command: 
Initiating monitor operation resIPMI-zfs4_monitor_0 locally on zfs-serv3 | 
action 4
Jun 20 12:16:53 [27559] zfs-serv3       crmd:   notice: te_rsc_command: 
Initiating monitor operation resIPMI-zfs3_monitor_0 locally on zfs-serv3 | 
action 5
Jun 20 12:16:53 [27559] zfs-serv3       crmd:   notice: te_rsc_command: 
Initiating monitor operation nfs-server_monitor_0 locally on zfs-serv3 | action 
6
Jun 20 12:16:53 [27559] zfs-serv3       crmd:   notice: process_lrm_event:      
Result of probe operation for resIPMI-zfs4 on zfs-serv3: 7 (not running) | 
call=13 key=resIPMI-zfs4_monitor_0 confirmed=true cib-update=31
Jun 20 12:16:53 [27559] zfs-serv3       crmd:   notice: process_lrm_event:      
Result of probe operation for resIPMI-zfs3 on zfs-serv3: 7 (not running) | 
call=17 key=resIPMI-zfs3_monitor_0 confirmed=true cib-update=32
Jun 20 12:16:53 [27559] zfs-serv3       crmd:   notice: process_lrm_event:      
Result of probe operation for nfs-server on zfs-serv3: 7 (not running) | 
call=21 key=nfs-server_monitor_0 confirmed=true cib-update=33
Jun 20 12:16:53 [27559] zfs-serv3       crmd:   notice: process_lrm_event:      
Result of probe operation for vm_storage on zfs-serv3: 7 (not running) | call=5 
key=vm_storage_monitor_0 confirmed=true cib-update=34
Jun 20 12:16:53 [27559] zfs-serv3       crmd:   notice: process_lrm_event:      
Result of probe operation for ha-ip on zfs-serv3: 7 (not running) | call=9 
key=ha-ip_monitor_0 confirmed=true cib-update=35
Jun 20 12:17:06 [27559] zfs-serv3       crmd:   notice: te_rsc_command: 
Initiating monitor operation vm_storage_monitor_0 on zfs-serv4 | action 2
Jun 20 12:17:06 [27559] zfs-serv3       crmd:   notice: te_rsc_command: 
Initiating monitor operation ha-ip_monitor_0 on zfs-serv4 | action 3
Jun 20 12:17:06 [27559] zfs-serv3       crmd:   notice: te_rsc_command: 
Initiating monitor operation resIPMI-zfs4_monitor_0 on zfs-serv4 | action 4
Jun 20 12:17:06 [27559] zfs-serv3       crmd:   notice: te_rsc_command: 
Initiating monitor operation resIPMI-zfs3_monitor_0 on zfs-serv4 | action 5
Jun 20 12:17:06 [27559] zfs-serv3       crmd:   notice: te_rsc_command: 
Initiating monitor operation nfs-server_monitor_0 on zfs-serv4 | action 6
Jun 20 12:17:06 [27559] zfs-serv3       crmd:   notice: te_rsc_command: 
Initiating start operation resIPMI-zfs4_start_0 locally on zfs-serv3 | action 11
Jun 20 12:17:06 [27559] zfs-serv3       crmd:   notice: te_rsc_command: 
Initiating start operation resIPMI-zfs3_start_0 on zfs-serv4 | action 13
Jun 20 12:17:07 [27559] zfs-serv3       crmd:   notice: process_lrm_event:      
Result of start operation for resIPMI-zfs4 on zfs-serv3: 0 (ok) | call=22 
key=resIPMI-zfs4_start_0 confirmed=true cib-update=61
Jun 20 12:17:07 [27559] zfs-serv3       crmd:   notice: te_rsc_command: 
Initiating monitor operation resIPMI-zfs3_monitor_60000 on zfs-serv4 | action 14
Jun 20 12:17:07 [27559] zfs-serv3       crmd:   notice: te_rsc_command: 
Initiating monitor operation resIPMI-zfs4_monitor_60000 locally on zfs-serv3 | 
action 12
Jun 20 12:17:08 [27559] zfs-serv3       crmd:     info: process_lrm_event:      
Result of monitor operation for resIPMI-zfs4 on zfs-serv3: 0 (ok) | call=23 
key=resIPMI-zfs4_monitor_60000 confirmed=false cib-update=62


Now I will try the steps from
https://wiki.clusterlabs.org/wiki/Debugging_Resource_Failures; maybe that helps.
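That wiki page boils down to invoking the resource agent by hand with the OCF environment set, so the agent's own errors become visible outside the cluster. A hedged sketch of that for the ZFS agent here; the `OCF_RESKEY_pool=vm_storage` parameter is a guess, substitute the real values from your resource configuration:

```shell
# Run the ocf:heartbeat:ZFS agent directly, as the Debugging_Resource_Failures
# wiki describes (sketch; the pool parameter value below is hypothetical and
# must match the cluster's actual resource configuration).
RA=/usr/lib/ocf/resource.d/heartbeat/ZFS
if [ -x "$RA" ]; then
    OCF_ROOT=/usr/lib/ocf OCF_RESKEY_pool=vm_storage "$RA" start
    echo "agent exit code: $?"   # 0 = success, per the OCF return-code scheme
else
    echo "ZFS agent not found at $RA; run this on a cluster node"
fi
```

The `ocf-tester` utility from the resource-agents package can exercise the same agent through all its actions in one go, if it is installed.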

Best regards,
Stefan
_______________________________________________
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
