On 02/18/2016 01:07 PM, Jeremy Matthews wrote:
> Hi,
> 
> We're having an issue with our cluster where after a reboot of our system a 
> location constraint reappears for the ClusterIP. This causes a problem, 
> because we have a daemon that checks the cluster state and waits until the 
> ClusterIP is started before it kicks off our application. We didn't have this 
> issue when using an earlier version of pacemaker. Here is the constraint as 
> shown by pcs:
> 
> [root@g5se-f3efce cib]# pcs constraint
> Location Constraints:
>   Resource: ClusterIP
>     Disabled on: g5se-f3efce (role: Started)
> Ordering Constraints:
> Colocation Constraints:
> 
> ...and here is our cluster status with the ClusterIP being Stopped:
> 
> [root@g5se-f3efce cib]# pcs status
> Cluster name: cl-g5se-f3efce
> Last updated: Thu Feb 18 11:36:01 2016
> Last change: Thu Feb 18 10:48:33 2016 via crm_resource on g5se-f3efce
> Stack: cman
> Current DC: g5se-f3efce - partition with quorum
> Version: 1.1.11-97629de
> 1 Nodes configured
> 4 Resources configured
> 
> 
> Online: [ g5se-f3efce ]
> 
> Full list of resources:
> 
> sw-ready-g5se-f3efce   (ocf::pacemaker:GBmon): Started g5se-f3efce
> meta-data      (ocf::pacemaker:GBmon): Started g5se-f3efce
> netmon (ocf::heartbeat:ethmonitor):    Started g5se-f3efce
> ClusterIP      (ocf::heartbeat:IPaddr2):       Stopped
> 
> 
> The cluster really just has one node at this time.
> 
> I retrieve the constraint ID, remove the constraint, verify that ClusterIP is 
> started,  and then reboot:
> 
> [root@g5se-f3efce cib]# pcs constraint ref ClusterIP
> Resource: ClusterIP
>   cli-ban-ClusterIP-on-g5se-f3efce
> [root@g5se-f3efce cib]# pcs constraint remove cli-ban-ClusterIP-on-g5se-f3efce
> 
> [root@g5se-f3efce cib]# pcs status
> Cluster name: cl-g5se-f3efce
> Last updated: Thu Feb 18 11:45:09 2016
> Last change: Thu Feb 18 11:44:53 2016 via crm_resource on g5se-f3efce
> Stack: cman
> Current DC: g5se-f3efce - partition with quorum
> Version: 1.1.11-97629de
> 1 Nodes configured
> 4 Resources configured
> 
> 
> Online: [ g5se-f3efce ]
> 
> Full list of resources:
> 
> sw-ready-g5se-f3efce   (ocf::pacemaker:GBmon): Started g5se-f3efce
> meta-data      (ocf::pacemaker:GBmon): Started g5se-f3efce
> netmon (ocf::heartbeat:ethmonitor):    Started g5se-f3efce
> ClusterIP      (ocf::heartbeat:IPaddr2):       Started g5se-f3efce
> 
> 
> [root@g5se-f3efce cib]# reboot
> 
> ....after reboot, log in, and the constraint is back and ClusterIP has not 
> started.
> 
> 
> I have noticed in /var/lib/pacemaker/cib that the cib-x.raw files get created 
> when there are changes to the cib (cib.xml). After a reboot, I see the 
> constraint being added in a diff between .raw files:
> 
> [root@g5se-f3efce cib]# diff cib-7.raw cib-8.raw
> 1c1
> < <cib epoch="239" num_updates="0" admin_epoch="0" 
> validate-with="pacemaker-1.2" cib-last-written="Thu Feb 18 11:44:53 2016" 
> update-origin="g5se-f3efce" update-client="crm_resource" 
> crm_feature_set="3.0.9" have-quorum="1" dc-uuid="g5se-f3efce">
> ---
>> <cib epoch="240" num_updates="0" admin_epoch="0" 
>> validate-with="pacemaker-1.2" cib-last-written="Thu Feb 18 11:46:49 2016" 
>> update-origin="g5se-f3efce" update-client="crm_resource" 
>> crm_feature_set="3.0.9" have-quorum="1" dc-uuid="g5se-f3efce">
> 50c50,52
> <     <constraints/>
> ---
>>     <constraints>
>>       <rsc_location id="cli-ban-ClusterIP-on-g5se-f3efce" rsc="ClusterIP" 
>> role="Started" node="g5se-f3efce" score="-INFINITY"/>
>>     </constraints>
> 
> 
> I have also looked in /var/log/cluster/corosync.log and seen logs where it 
> seems the cib is getting updated. I'm not sure if the constraint is being put 
> back in at shutdown or at start up. I just don't understand why it's being 
> put back in. I don't think our daemon code or other scripts are doing this,  
> but it is something I could verify.

I would look at any scripts running around that time first. Constraints
that start with "cli-" were created by one of the CLI tools, so
something must be calling it. The most likely candidates are pcs
resource move/ban or crm_resource -M/--move/-B/--ban.

> ********************************
> 
> From "yum info pacemaker", my current version is:
> 
> Name        : pacemaker
> Arch        : x86_64
> Version     : 1.1.12
> Release     : 8.el6_7.2
> 
> My earlier version was:
> 
> Name        : pacemaker
> Arch        : x86_64
> Version     : 1.1.10
> Release     : 1.el6_4.4
> 
> I'm still using an earlier version pcs, because the new one seems to have 
> issues with python:
> 
> Name        : pcs
> Arch        : noarch
> Version     : 0.9.90
> Release     : 1.0.1.el6.centos
> 
> *******************************
> 
> If anyone has ideas on the cause or thoughts on this, anything would be 
> greatly appreciated.
> 
> Thanks!
> 
> 
> 
> Jeremy Matthews


_______________________________________________
Users mailing list: [email protected]
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

Reply via email to