I have been getting extremely strange behavior from a Corosync/Pacemaker 
install on OVH Public Cloud servers.

After hours of Googling, I thought I would try posting here to see if somebody 
knows what to do.

I see this in my logs very frequently:
Jan 31 10:31:36 [581] fusion01-2 corosync warning [MAIN  ] Corosync main 
process was not scheduled for 24334.5645 ms (threshold is 2400.0000 ms). 
Consider token timeout increase.
Jan 31 10:31:36 [581] fusion01-2 corosync notice  [TOTEM ] A processor failed, 
forming new configuration.

I have increased token time to 10s and this still occurs regularly even though 
both hosts are always up.

There are also times when the floating IP script is fired, but corosync does 
not seem aware. When you run crm status it will show the ip being bound the 
fusion01-1 when in fact the script fired and moved it to fusion01-2.

Finally, the timing of the logs seems very odd. This is what the end of my 
corosync log file looks like. Notice the times appear out of order in the logs. 
I'm ripping my hair out with these issues. Anybody have a clue what may be 
going on here?

Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_process_request:   
            Completed cib_modify operation for section status: OK (rc=0, 
origin=fusion01-1/crmd/245, version=0.81.123)
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
Diff: --- 0.81.123 2
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
Diff: +++ 0.81.124 (null)
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
-- /cib/status/node_state[@id='1']/lrm[@id='1']
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
+  /cib:  @num_updates=124
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_process_request:   
            Completed cib_delete operation for section 
//node_state[@uname='fusion01-1']/lrm: OK (rc=0, origin=fusion01-1/crmd/246, 
version=0.81.124)
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
Diff: --- 0.81.124 2
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
Diff: +++ 0.81.125 (null)
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
+  /cib:  @num_updates=125
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
+  /cib/status/node_state[@id='1']:  @crm-debug-origin=do_lrm_query_internal
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
++ /cib/status/node_state[@id='1']:  <lrm id="1"/>
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
++                                     <lrm_resources>
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
++                                       <lrm_resource id="FloatIP" 
type="floatip-ocf" class="ocf" provider="ovh">
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
++                                         <lrm_rsc_op id="FloatIP_last_0" 
operation_key="FloatIP_start_0" operation="start" 
crm-debug-origin="build_active_RAs" crm_feature_set="3.0.10" 
transition-key="5:31:0:08c3f481-ccde-4f75-b1a7-acf8168cd0c1" 
transition-magic="0:0;5:31:0:08c3f481-ccde-4f75-b1a7-acf8168cd0c1" 
on_node="fusion01-1" call-id="17" rc-code="0" op-status="0" interval="0" 
last-run="1485859189" last-rc-change="1485859189" e
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
++                                       </lrm_resource>
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
++                                       <lrm_resource id="FS" type="FSSofia" 
class="lsb">
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
++                                         <lrm_rsc_op id="FS_last_failure_0" 
operation_key="FS_monitor_0" operation="monitor" 
crm-debug-origin="build_active_RAs" crm_feature_set="3.0.10" 
transition-key="4:1:7:1fe20aa3-b305-4282-99a3-b1f8190d3c2c" 
transition-magic="0:0;4:1:7:1fe20aa3-b305-4282-99a3-b1f8190d3c2c" 
on_node="fusion01-1" call-id="9" rc-code="0" op-status="0" interval="0" 
last-run="1485760597" last-rc-change="1485760597" ex
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
++                                         <lrm_rsc_op id="FS_last_0" 
operation_key="FS_start_0" operation="start" 
crm-debug-origin="build_active_RAs" crm_feature_set="3.0.10" 
transition-key="7:31:0:08c3f481-ccde-4f75-b1a7-acf8168cd0c1" 
transition-magic="0:0;7:31:0:08c3f481-ccde-4f75-b1a7-acf8168cd0c1" 
on_node="fusion01-1" call-id="18" rc-code="0" op-status="0" interval="0" 
last-run="1485859191" last-rc-change="1485859191" exec-time="
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
++                                         <lrm_rsc_op id="FS_monitor_1000" 
operation_key="FS_monitor_1000" operation="monitor" 
crm-debug-origin="build_active_RAs" crm_feature_set="3.0.10" 
transition-key="1:31:0:08c3f481-ccde-4f75-b1a7-acf8168cd0c1" 
transition-magic="0:0;1:31:0:08c3f481-ccde-4f75-b1a7-acf8168cd0c1" 
on_node="fusion01-1" call-id="19" rc-code="0" op-status="0" interval="1000" 
last-rc-change="1485859191" exec-time="20" qu
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
++                                       </lrm_resource>
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
++                                     </lrm_resources>
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
++                                   </lrm>
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_process_request:   
            Completed cib_modify operation for section status: OK (rc=0, 
origin=fusion01-1/crmd/247, version=0.81.125)
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_process_request:   
            Completed cib_modify operation for section nodes: OK (rc=0, 
origin=fusion01-1/crmd/250, version=0.81.125)
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
Diff: --- 0.81.125 2
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
Diff: +++ 0.81.126 (null)
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
+  /cib:  @num_updates=126
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
+  /cib/status/node_state[@id='2']:  @crm-debug-origin=do_state_transition
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
+  /cib/status/node_state[@id='1']:  @crm-debug-origin=do_state_transition
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_process_request:   
            Completed cib_modify operation for section status: OK (rc=0, 
origin=fusion01-1/crmd/251, version=0.81.126)
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_process_request:   
            Completed cib_modify operation for section cib: OK (rc=0, 
origin=fusion01-1/crmd/252, version=0.81.126)
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_process_request:   
            Completed cib_modify operation for section status: OK (rc=0, 
origin=fusion01-1/attrd/16, version=0.81.126)
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
Diff: --- 0.81.126 2
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
Diff: +++ 0.81.127 (null)
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_perform_op:        
+  /cib:  @num_updates=127
Jan 31 10:31:47 [21062] fusion01-2        cib:     info: cib_process_request:   
            Completed cib_modify operation for section status: OK (rc=0, 
origin=fusion01-1/attrd/17, version=0.81.127)
Jan 31 10:31:52 [21062] fusion01-2        cib:     info: cib_process_ping:     
Reporting our current digest to fusion01-1: 8f48b8c10ce54828f87b27bea1b50d20 
for 0.81.127 (0x7ff1485dadc0 0)
Jan 31 10:32:16 [21067] fusion01-2       crmd:     info: throttle_send_command: 
  New throttle mode: 0000 (was 0010)
Jan 31 10:31:36 [581] fusion01-2 corosync warning [MAIN  ] Corosync main 
process was not scheduled for 24334.5645 ms (threshold is 2400.0000 ms). 
Consider token timeout increase.
Jan 31 10:31:36 [581] fusion01-2 corosync notice  [TOTEM ] A processor failed, 
forming new configuration.
Jan 31 10:31:36 [581] fusion01-2 corosync warning [MAIN  ] Corosync main 
process was not scheduled for 9222.6709 ms (threshold is 2400.0000 ms). 
Consider token timeout increase.
Jan 31 10:31:36 [581] fusion01-2 corosync notice  [TOTEM ] A new membership 
(192.168.128.21:6120) was formed. Members left: 1
Jan 31 10:31:36 [581] fusion01-2 corosync notice  [TOTEM ] Failed to receive 
the leave message. failed: 1
Jan 31 10:31:36 [581] fusion01-2 corosync notice  [QUORUM] Members[1]: 2
Jan 31 10:31:36 [581] fusion01-2 corosync notice  [MAIN  ] Completed service 
synchronization, ready to provide service.
Jan 31 10:31:46 [581] fusion01-2 corosync notice  [TOTEM ] A new membership 
(192.168.128.20:6128) was formed. Members joined: 1
Jan 31 10:31:46 [581] fusion01-2 corosync notice  [QUORUM] Members[2]: 1 2
Jan 31 10:31:46 [581] fusion01-2 corosync notice  [MAIN  ] Completed service 
synchronization, ready to provide service.

Cheers,
Corey Moullas

Network Administrator
EMAK|TECH

T: (514)-4000-226 x101
E: cmoul...@emak.tech<mailto:cmoul...@emak.tech>

IMPORTANT: For all support-related requests, please email 
supp...@emak.tech<mailto:supp...@emak.tech> / Pour toute assistance technique, 
veuillez envoyer un courriel à supp...@emak.tech<mailto:supp...@emak.tech>.

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

Reply via email to