Sent from my iPhone
> On 17 Apr 2018, at 07:16, 范国腾 <[email protected]> wrote:
>
> I checked the status again. It is not that the slave is never promoted; it is promoted about 15 minutes after the cluster starts.
>
> I tried this in three labs and the results are the same: the promotion happens 15 minutes after the cluster starts.
>
> Why is there an approximately 15-minute delay every time?

That rings a bell. 15 minutes is the default interval for time-based rule re-evaluation; my understanding so far is that this timer triggers the other configuration changes (basically, it runs the policy engine to make a decision). I had a similar effect when I attempted to change the quorum state directly, without going via external node events. So it looks like whatever sets the master scores does not trigger the policy engine.

> Apr 16 22:08:32 node1 attrd[16618]: notice: Node sds1 state is now member
> Apr 16 22:08:32 node1 attrd[16618]: notice: Node sds2 state is now member
>
> ......
>
> Apr 16 22:21:36 node1 pgsqlms(pgsqld)[18230]: INFO: Execute action monitor and the result 0
> Apr 16 22:21:52 node1 pgsqlms(pgsqld)[18257]: INFO: Execute action monitor and the result 0
> Apr 16 22:22:09 node1 pgsqlms(pgsqld)[18296]: INFO: Execute action monitor and the result 0
> Apr 16 22:22:25 node1 pgsqlms(pgsqld)[18315]: INFO: Execute action monitor and the result 0
> Apr 16 22:22:41 node1 pgsqlms(pgsqld)[18343]: INFO: Execute action monitor and the result 0
> Apr 16 22:22:57 node1 pgsqlms(pgsqld)[18362]: INFO: Execute action monitor and the result 0
> Apr 16 22:23:13 node1 pgsqlms(pgsqld)[18402]: INFO: Execute action monitor and the result 0
> Apr 16 22:23:29 node1 pgsqlms(pgsqld)[18421]: INFO: Execute action monitor and the result 0
> Apr 16 22:23:45 node1 pgsqlms(pgsqld)[18449]: INFO: Execute action monitor and the result 0
> Apr 16 22:23:57 node1 crmd[16620]: notice: State transition S_IDLE -> S_POLICY_ENGINE
> Apr 16 22:23:57 node1 pengine[16619]: notice: Promote pgsqld:0#011(Slave -> Master sds1)
> Apr 16 22:23:57 node1 pengine[16619]: notice: Start master-vip#011(sds1)
> Apr 16 22:23:57 node1 pengine[16619]: notice: Start pgsql-master-ip#011(sds1)
> Apr 16 22:23:57 node1 pengine[16619]: notice: Calculated transition 1, saving inputs in /var/lib/pacemaker/pengine/pe-input-18.bz2
> Apr 16 22:23:57 node1 crmd[16620]: notice: Initiating cancel operation pgsqld_monitor_16000 locally on sds1
> Apr 16 22:23:57 node1 crmd[16620]: notice: Initiating notify operation pgsqld_pre_notify_promote_0 locally on sds1
> Apr 16 22:23:57 node1 crmd[16620]: notice: Initiating notify operation pgsqld_pre_notify_promote_0 on sds2
> Apr 16 22:23:58 node1 pgsqlms(pgsqld)[18467]: INFO: Promoting instance on node "sds1"
> Apr 16 22:23:58 node1 pgsqlms(pgsqld)[18467]: INFO: Current node TL#LSN: 4#117440512
> Apr 16 22:23:58 node1 pgsqlms(pgsqld)[18467]: INFO: Execute action notify and the result 0
> Apr 16 22:23:58 node1 crmd[16620]: notice: Result of notify operation for pgsqld on sds1: 0 (ok)
> Apr 16 22:23:58 node1 crmd[16620]: notice: Initiating promote operation pgsqld_promote_0 locally on sds1
> Apr 16 22:23:58 node1 pgsqlms(pgsqld)[18499]: INFO: Waiting for the promote to complete
> Apr 16 22:23:59 node1 pgsqlms(pgsqld)[18499]: INFO: Promote complete
>
> [root@node1 ~]# crm_simulate -sL
>
> Current cluster status:
> Online: [ sds1 sds2 ]
>
> Master/Slave Set: pgsql-ha [pgsqld]
>     Masters: [ sds1 ]
>     Slaves: [ sds2 ]
> Resource Group: mastergroup
>     master-vip (ocf::heartbeat:IPaddr2): Started sds1
>     pgsql-master-ip (ocf::heartbeat:IPaddr2): Started sds1
>
> Allocation scores:
> clone_color: pgsql-ha allocation score on sds1: 1
> clone_color: pgsql-ha allocation score on sds2: 1
> clone_color: pgsqld:0 allocation score on sds1: 1003
> clone_color: pgsqld:0 allocation score on sds2: 1
> clone_color: pgsqld:1 allocation score on sds1: 1
> clone_color: pgsqld:1 allocation score on sds2: 1002
> native_color: pgsqld:0 allocation score on sds1: 1003
> native_color: pgsqld:0 allocation score on sds2: 1
> native_color: pgsqld:1 allocation score on sds1: -INFINITY
> native_color: pgsqld:1 allocation score on sds2: 1002
> pgsqld:0 promotion score on sds1: 1002
> pgsqld:1 promotion score on sds2: 1001
> group_color: mastergroup allocation score on sds1: 0
> group_color: mastergroup allocation score on sds2: 0
> group_color: master-vip allocation score on sds1: 0
> group_color: master-vip allocation score on sds2: 0
> native_color: master-vip allocation score on sds1: 1003
> native_color: master-vip allocation score on sds2: -INFINITY
> native_color: pgsql-master-ip allocation score on sds1: 1003
> native_color: pgsql-master-ip allocation score on sds2: -INFINITY
>
> Transition Summary:
> [root@node1 ~]#
>
> You can reproduce the issue with two nodes: execute the following commands, then run "pcs cluster stop --all" followed by "pcs cluster start --all".
>
> pcs resource create pgsqld ocf:heartbeat:pgsqlms bindir=/home/highgo/highgo/database/4.3.1/bin pgdata=/home/highgo/highgo/database/4.3.1/data op start timeout=600s op stop timeout=60s op promote timeout=300s op demote timeout=120s op monitor interval=10s timeout=100s role="Master" op monitor interval=16s timeout=100s role="Slave" op notify timeout=60s
> pcs resource master pgsql-ha pgsqld notify=true interleave=true
>
> -----Original Message-----
> From: 范国腾
> Sent: 17 April 2018 10:25
> To: 'Jehan-Guillaume de Rorthais' <[email protected]>
> Cc: Cluster Labs - All topics related to open-source clustering welcomed <[email protected]>
> Subject: [ClusterLabs] No slave is promoted to be master
>
> Hi,
>
> We installed a new lab which has only the postgres resource and the VIP resource. After the cluster is installed, the status is OK: one node is master and the other is slave. Then I run "pcs cluster stop --all" to stop the cluster, and afterwards "pcs cluster start --all" to start it again. All of the pgsql instances are in slave status and cannot be promoted to master any more, like this:
>
> Master/Slave Set: pgsql-ha [pgsqld]
>     Slaves: [ sds1 sds2 ]
>
> There is no error in the log, and "crm_simulate -sL" shows the following; the scores seem to be OK too. The detailed log and config are in the attachment.
>
> [root@node1 ~]# crm_simulate -sL
>
> Current cluster status:
> Online: [ sds1 sds2 ]
>
> Master/Slave Set: pgsql-ha [pgsqld]
>     Slaves: [ sds1 sds2 ]
> Resource Group: mastergroup
>     master-vip (ocf::heartbeat:IPaddr2): Stopped
>     pgsql-master-ip (ocf::heartbeat:IPaddr2): Stopped
>
> Allocation scores:
> clone_color: pgsql-ha allocation score on sds1: 1
> clone_color: pgsql-ha allocation score on sds2: 1
> clone_color: pgsqld:0 allocation score on sds1: 1003
> clone_color: pgsqld:0 allocation score on sds2: 1
> clone_color: pgsqld:1 allocation score on sds1: 1
> clone_color: pgsqld:1 allocation score on sds2: 1002
> native_color: pgsqld:0 allocation score on sds1: 1003
> native_color: pgsqld:0 allocation score on sds2: 1
> native_color: pgsqld:1 allocation score on sds1: -INFINITY
> native_color: pgsqld:1 allocation score on sds2: 1002
> pgsqld:0 promotion score on sds1: 1002
> pgsqld:1 promotion score on sds2: 1001
> group_color: mastergroup allocation score on sds1: 0
> group_color: mastergroup allocation score on sds2: 0
> group_color: master-vip allocation score on sds1: 0
> group_color: master-vip allocation score on sds2: 0
> native_color: master-vip allocation score on sds1: 1003
> native_color: master-vip allocation score on sds2: -INFINITY
> native_color: pgsql-master-ip allocation score on sds1: 1003
> native_color: pgsql-master-ip allocation score on sds2: -INFINITY
>
> Transition Summary:
> * Promote pgsqld:0 (Slave -> Master sds1)
> * Start master-vip (sds1)
> * Start pgsql-master-ip (sds1)
> _______________________________________________
> Users mailing list: [email protected]
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
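[Editor's note] The 15-minute delay discussed above matches Pacemaker's `cluster-recheck-interval` cluster property, which defaults to 15 minutes and controls how often the policy engine is re-run even when no cluster event has scheduled a transition. A minimal diagnostic sketch, assuming stock `pcs` property commands on a test cluster (lowering the interval only narrows the window; it does not fix whatever is failing to trigger the policy engine when the master scores change):

```shell
# Inspect the current value; if unset, the 15-minute default applies.
pcs property list --all | grep cluster-recheck-interval

# Temporarily lower the re-evaluation interval to 60 seconds. If the
# promotion now happens within about a minute of "pcs cluster start --all",
# the delay is the recheck timer, not the resource agent itself.
pcs property set cluster-recheck-interval=60s
```

This is a diagnostic aid only; the underlying question of why the master-score update does not schedule a new transition remains open.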
