Re: [ClusterLabs] Keep printing "Sent 0 CPG messages" in corosync.log

2018-10-01 Thread Jan Friesse

lkxjtu,




corosync.log has been printing the following messages for several days.
What's wrong with the corosync cluster? The CPU load is currently not high.


The interesting messages from the logs you sent are:

Sep 30 01:23:28 [127667] paas-controller-172-21-0-2 corosync warning 
[MAIN  ] timer_function_scheduler_timeout Corosync main process was not 
scheduled for 10470.3652 ms (threshold is 2400. ms). Consider token 
timeout increase.


and

Sep 30 01:23:29 [127667] paas-controller-172-21-0-2 corosync notice 
[TOTEM ] pause_flush Process pause detected for 8760 ms, flushing 
membership messages.
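
To find every occurrence of these two messages in a long corosync.log,
something like the following sketch could be used. The function name
find_pauses and the tuple layout are hypothetical, and the patterns are
written only from the two messages quoted above, so other corosync
versions may word them differently.

```python
import re

# Patterns built from the two messages quoted above. Whitespace is matched
# loosely so that log lines wrapped by a mail client still match.
NOT_SCHEDULED = re.compile(
    r"Corosync\s+main\s+process\s+was\s+not\s+scheduled\s+for\s+([\d.]+)\s*ms"
    r"\s+\(threshold\s+is\s+([\d.]+)\s*ms\)"
)
PAUSE = re.compile(r"Process\s+pause\s+detected\s+for\s+(\d+)\s*ms")


def find_pauses(log_text):
    """Return (kind, pause_ms, threshold_ms or None) tuples in log order."""
    events = []
    for m in NOT_SCHEDULED.finditer(log_text):
        events.append(
            (m.start(), "not_scheduled", float(m.group(1)), float(m.group(2)))
        )
    for m in PAUSE.finditer(log_text):
        events.append((m.start(), "pause_flush", float(m.group(1)), None))
    # Sort by position in the text, then drop the position.
    return [event[1:] for event in sorted(events, key=lambda e: e[0])]
```

Calling find_pauses(open("corosync.log").read()) on the affected node
(the path is an assumption) would list each pause with its duration, which
makes it easier to see how often and for how long corosync was stalled.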



This means that corosync was unable to get the CPU time it needs to run.
This can happen because of:
- (Most often) the cluster is running in heavily overloaded VMs (quite
often in cloud environments)
- Corosync doesn't have RT priority, or another RT priority task is using
most of the CPU time
- An I/O problem
- A misbehaving watchdog device
- A bug in corosync
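
The RT priority item in the list above can be checked on Linux with a
short script. This is only a sketch: the names classify and is_realtime
are hypothetical, and the PID has to be supplied by you (for example the
one shown in brackets in the corosync log lines on the affected node).

```python
import os

# Real-time scheduling policies on Linux. getattr() keeps the module
# importable on platforms where the os module lacks these constants.
RT_POLICIES = {
    getattr(os, "SCHED_FIFO", 1): "SCHED_FIFO",
    getattr(os, "SCHED_RR", 2): "SCHED_RR",
}
# The kernel may OR this flag into the reported policy value.
RESET_ON_FORK = getattr(os, "SCHED_RESET_ON_FORK", 0)


def classify(policy):
    """Map a raw policy value to a real-time policy name or 'not real-time'."""
    return RT_POLICIES.get(policy & ~RESET_ON_FORK, "not real-time")


def is_realtime(pid):
    """True if the process with this PID runs under SCHED_FIFO or SCHED_RR."""
    return classify(os.sched_getscheduler(pid)) != "not real-time"
```

If is_realtime() returns False for the corosync PID, the second cause in
the list is worth a closer look. A True result does not rule out another
RT task using most of the CPU time.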

Honza



Cluster version information:
[root@paas-controller-172-167-40-24:~]$ rpm -q corosync
corosync-2.4.0-9.el7_4.2.x86_64
[root@paas-controller-172-167-40-24:~]$ rpm -q pacemaker
pacemaker-1.1.16-12.el7_4.2.x86_64



Sep 30 01:23:27 [128232] paas-controller-172-21-0-2 cib: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=363): Try again (6)

...
___
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Keep printing "Sent 0 CPG messages" in corosync.log

2018-09-29 Thread lkxjtu


corosync.log has been printing the following messages for several days.
What's wrong with the corosync cluster? The CPU load is currently not high.

Cluster version information:
[root@paas-controller-172-167-40-24:~]$ rpm -q corosync
corosync-2.4.0-9.el7_4.2.x86_64
[root@paas-controller-172-167-40-24:~]$ rpm -q pacemaker
pacemaker-1.1.16-12.el7_4.2.x86_64



Sep 30 01:23:27 [128232] paas-controller-172-21-0-2 cib: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=363): Try again (6)
Sep 30 01:23:28 [128234] paas-controller-172-21-0-2  attrd: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=14519): Try again (6)
Sep 30 01:23:28 [128232] paas-controller-172-21-0-2 cib: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=363): Try again (6)
Sep 30 01:23:28 [128234] paas-controller-172-21-0-2  attrd: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=14519): Try again (6)
Sep 30 01:23:28 [128232] paas-controller-172-21-0-2 cib: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=363): Try again (6)
Sep 30 01:23:28 [128234] paas-controller-172-21-0-2  attrd: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=14519): Try again (6)
Sep 30 01:23:28 [128232] paas-controller-172-21-0-2 cib: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=363): Try again (6)
Sep 30 01:23:28 [127667] paas-controller-172-21-0-2 corosync warning [MAIN  ] 
timer_function_scheduler_timeout Corosync main process was not scheduled for 
10470.3652 ms (threshold is 2400. ms). Consider token timeout increase.
Sep 30 01:23:29 [128234] paas-controller-172-21-0-2  attrd: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=14519): Try again (6)
Sep 30 01:23:29 [128232] paas-controller-172-21-0-2 cib: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=363): Try again (6)
Sep 30 01:23:29 [128234] paas-controller-172-21-0-2  attrd: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=14519): Try again (6)
Sep 30 01:23:29 [128232] paas-controller-172-21-0-2 cib: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=363): Try again (6)
Sep 30 01:23:29 [128234] paas-controller-172-21-0-2  attrd: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=14519): Try again (6)
Sep 30 01:23:29 [128232] paas-controller-172-21-0-2 cib: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=363): Try again (6)
Sep 30 01:23:29 [127667] paas-controller-172-21-0-2 corosync notice  [TOTEM ] 
pause_flush Process pause detected for 8760 ms, flushing membership messages.
Sep 30 01:23:30 [128234] paas-controller-172-21-0-2  attrd: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=14519): Try again (6)
Sep 30 01:23:30 [128232] paas-controller-172-21-0-2 cib: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=363): Try again (6)
Sep 30 01:23:30 [128234] paas-controller-172-21-0-2  attrd: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=14519): Try again (6)
Sep 30 01:23:30 [128232] paas-controller-172-21-0-2 cib: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=363): Try again (6)
Sep 30 01:23:30 [128234] paas-controller-172-21-0-2  attrd: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=14519): Try again (6)
Sep 30 01:23:30 [128232] paas-controller-172-21-0-2 cib: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=363): Try again (6)
Sep 30 01:23:31 [128234] paas-controller-172-21-0-2  attrd: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=14519): Try again (6)
Sep 30 01:23:31 [128232] paas-controller-172-21-0-2 cib: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=363): Try again (6)
Sep 30 01:23:31 [128234] paas-controller-172-21-0-2  attrd: info: 
crm_cs_flush: Sent 0 CPG messages  (13 remaining, last=14519): Try again (6)
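
Since the crm_cs_flush lines above repeat with the same values, a per-daemon
summary may be easier to read than the raw log. The sketch below is one
possible way to build it; summarize_flushes and the dictionary keys are
hypothetical names, and reading last= as the id of the last message sent is
an assumption based only on the message text. The trailing "(6)" is the
corosync error code for "try again" (CS_ERR_TRY_AGAIN).

```python
import re
from collections import defaultdict

# Matches the crm_cs_flush lines shown above, including entries that a mail
# client wrapped after "info:". The daemon name is taken from the letters
# directly before the first colon (cib, attrd).
FLUSH = re.compile(
    r"([a-z_]+):\s+info:\s+crm_cs_flush:\s+Sent (\d+) CPG messages\s+"
    r"\((\d+) remaining, last=(\d+)\)"
)


def summarize_flushes(log_text):
    """Per daemon: flush attempts, total messages sent, distinct last= ids."""
    summary = defaultdict(lambda: {"attempts": 0, "sent": 0, "last_ids": set()})
    for daemon, sent, _remaining, last in FLUSH.findall(log_text):
        entry = summary[daemon]
        entry["attempts"] += 1
        entry["sent"] += int(sent)
        entry["last_ids"].add(int(last))
    return dict(summary)
```

Many attempts with zero messages sent and a single last= id per daemon
would suggest that the queue is not draining at all, rather than draining
slowly.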