Re: [Linux-HA] Default-resource-stickiness of infinity with DRBD not keeping Primary stuck

Dejan Muhamedagic Thu, 31 Jan 2008 01:44:10 -0800

Hi,

On Wed, Jan 30, 2008 at 04:08:14PM -0700, Daniel Stickney wrote:
> Thanks for your suggestion Dejan. I have installed the 2.1.3-2 CentOS 
> packages from http://people.centos.org/~hughesjr/heartbeat/5/i386/ and will 
> begin testing with them.
>
> I am willing to work as a tester for this 2.1.3 CentOS package. So far I 
> have noticed two things that could be improved in the package. Should I be 
> posting them here, or is there a better way to work with Johnny? Maybe he 
> will just see this posting and let me know.


I don't know for sure, but my guess is that it is probably best
to send your results to him directly.

> The first:
> While installing the heartbeat-2.1.3-2.el5.centos.i386 package it errors 
> out trying to add the user "hacluster" which already exists from my 
> previous install.
> ---------
> [EMAIL PROTECTED] ~]# userdel hacluster
> [EMAIL PROTECTED] ~]# userdel hacluster
> userdel: user hacluster does not exist
> [EMAIL PROTECTED] ~]# rpm -ivh heartbeat-2.1.3-2.el5.centos.i386.rpm
> Preparing... ########################################### [100%]
> useradd: user hacluster exists

This looks odd. You removed the user by hand first, and yet
useradd still finds it. The original heartbeat.spec handles an
existing user, so I guess the CentOS one should too. Not sure
what is going on here.

> error: %pre(heartbeat-2.1.3-2.el5.centos.i386) scriptlet failed, exit 
> status 9
> error: install: %pre scriptlet failed (2), skipping 
> heartbeat-2.1.3-2.el5.centos
> [EMAIL PROTECTED] ~]# rpm -ivh heartbeat-2.1.3-2.el5.centos.i386.rpm --force
> Preparing... ########################################### [100%]
> 1:heartbeat ########################################### [100%]
> [EMAIL PROTECTED] ~]#
> ---------
> I have to run "rpm -ivh heartbeat-2.1.3-2.el5.centos.i386.rpm --force" to 
> get it to install. Hopefully the script logic can deal with the user 
> already existing and not error out.
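
For what it's worth, a guard along these lines in the %pre scriptlet
would let the install continue when the account is already there. This
is only a sketch and not what either spec file actually contains: the
ensure_user function, the USERADD override and the -r (system account)
flag are my assumptions for illustration.

```shell
# Hypothetical %pre-style guard: create the account only if it is missing.
ensure_user() {
    user="$1"
    # id -u exits non-zero when the account cannot be resolved.
    if id -u "$user" >/dev/null 2>&1; then
        echo "user $user already exists, skipping useradd"
        return 0
    fi
    # USERADD is an illustrative override so the logic can be dry-run
    # without root; by default the real useradd is called.
    ${USERADD:-useradd} -r "$user"
}
```

In a scriptlet this would be called as "ensure_user hacluster", so an
existing account no longer makes %pre exit with an error.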

> The second:
> After installing and starting, when running "crm_mon" the standard 
> "console" mode is not available and mentions curses must be available at 
> compile time.
> ---------
> # crm_mon -i 1
> Defaulting to one-shot mode
> You need to have curses available at compile time to enable console mode
> ---------
> This appears to want ncurses-devel installed on the compile system.

Right. Please report this too. Though curses is not required,
there is no reason not to include it when it's available.

> Also, thanks for your work on heartbeat. It is an excellent high 
> availability solution.

Thanks,

Dejan

> Thanks
> -Daniel
>
>
> Dejan Muhamedagic wrote:
>> Hi,
>>
>> On Tue, Jan 29, 2008 at 01:25:57PM -0700, Daniel Stickney wrote:
>>   
>>> Hello everyone,
>>>
>>> Our setup: CentOS 5 (kernel 2.6.18-53), Heartbeat 
>>> heartbeat-2.1.2-3.el5.centos, DRBD drbd-8.0.6-1.el5.centos
>>>
>>> We are running into a problem with getting the master DRBD resource to 
>>> stick on a node it has failed onto. We have a simple 2 node cluster for 
>>> demonstration of the issue, halinux1 and halinux2, with a single DRBD 
>>> resource. What we are seeing is halinux2 selected as the Master node for 
>>> DRBD on heartbeat startup, halinux1 as the slave. When halinux2 is placed 
>>> into standby, halinux1 is promoted to DRBD master as expected. When 
>>> halinux2 is taken out of standby mode, halinux1 is demoted to secondary 
>>> and halinux2 is promoted to master. We don't want this failback action. 
>>> We want the DRBD master to stay on whatever node it is on unless there is 
>>> a failure requiring it to move. We have default-resource-stickiness set 
>>> to "infinity" in our cib.xml file. I repeated this experiment with a 
>>> single IP address resource (no DRBD), and the stickiness of infinity 
>>> worked exactly as expected: the IP stayed on whatever node it was on 
>>> unless there was a failure (or standby mode) on the local node requiring 
>>> the IP to move, so that was a positive confirmation that outside of our 
>>> testing with DRBD, the stickiness of infinity works. We would very much 
>>> appreciate suggestions on how we might go about resolving this issue.
>>>     
>>
>> The multistate resources should have been much improved in
>> version 2.1.3. Johnny Hughes, the CentOS heartbeat maintainer,
>> has 2.1.3 available and is looking for testers:
>>
>> http://marc.info/?l=linux-ha&m=120110530418348&w=2
>>
>> Thanks,
>>
>> Dejan
>>
>>
>>   
>>> Here is the cib.xml file:
>>> ----------------------------------
>>> <cib generated="true" admin_epoch="0" have_quorum="true" 
>>> ignore_dtd="false" num_peers="2" cib_feature_revision="1.3" epoch="35" 
>>> num_updates="1" cib-last-written="Tue Jan 29 12:36:17 2008" 
>>> ccm_transition="2" dc_uuid="d2c440e4-9668-4a70-b7e2-de7f52834325">
>>>  <configuration>
>>>    <crm_config>
>>>      <cluster_property_set id="cluster_defaults">
>>>        <attributes>
>>>          <nvpair name="default-resource-stickiness" id="stickiness" 
>>> value="INFINITY"/>
>>>        </attributes>
>>>      </cluster_property_set>
>>>    </crm_config>
>>>    <nodes>
>>>      <node uname="halinux2" type="normal" 
>>> id="216a5f87-c472-4ce6-a3f1-7ce4f6dc1bae">
>>>        <instance_attributes 
>>> id="nodes-216a5f87-c472-4ce6-a3f1-7ce4f6dc1bae">
>>>          <attributes>
>>>            <nvpair name="standby" 
>>> id="standby-216a5f87-c472-4ce6-a3f1-7ce4f6dc1bae" value="false"/>
>>>          </attributes>
>>>        </instance_attributes>
>>>      </node>
>>>      <node uname="halinux1" type="normal" 
>>> id="d2c440e4-9668-4a70-b7e2-de7f52834325">
>>>        <instance_attributes 
>>> id="nodes-d2c440e4-9668-4a70-b7e2-de7f52834325">
>>>          <attributes>
>>>            <nvpair name="standby" 
>>> id="standby-d2c440e4-9668-4a70-b7e2-de7f52834325" value="false"/>
>>>          </attributes>
>>>        </instance_attributes>
>>>      </node>
>>>    </nodes>
>>>    <resources>
>>>      <master_slave id="ms-drbd0">
>>>        <meta_attributes id="ma-ms-drbd0">
>>>          <attributes>
>>>            <nvpair id="ma-ms-drbd0-1" name="clone_max" value="2"/>
>>>            <nvpair id="ma-ms-drbd0-2" name="clone_node_max" value="1"/>
>>>            <nvpair id="ma-ms-drbd0-3" name="master_max" value="1"/>
>>>            <nvpair id="ma-ms-drbd0-4" name="master_node_max" value="1"/>
>>>            <nvpair id="ma-ms-drbd0-5" name="notify" value="yes"/>
>>>            <nvpair id="ma-ms-drbd0-6" name="globally_unique" 
>>> value="false"/>
>>>            <nvpair id="ma-ms-drbd0-7" name="target_role" 
>>> value="started"/>
>>>          </attributes>
>>>        </meta_attributes>
>>>        <primitive id="DRBD" class="ocf" provider="heartbeat" type="drbd">
>>>          <instance_attributes id="ia-DRBD">
>>>            <attributes>
>>>              <nvpair id="ia-DRBD-1" name="drbd_resource" value="mysql"/>
>>>            </attributes>
>>>          </instance_attributes>
>>>        </primitive>
>>>      </master_slave>
>>>    </resources>
>>>    <constraints/>
>>>  </configuration>
>>> </cib>
>>> ----------------------------------
>>> =========================================================================
>>>
>>> Here is our ha.cf file:
>>> ----------------------------------
>>> use_logd yes
>>> udpport 695
>>> bcast eth0
>>> node    halinux1
>>> node    halinux2
>>> crm on
>>> ----------------------------------
>>> =========================================================================
>>>
>>> Here is a link to the /var/log/messages output on halinux1 starting from 
>>> the time when halinux2 comes out of standby mode and the unwanted 
>>> failback occurs: http://pastebin.com/m6e55f6b3
>>>
>>> Thank you in advance for your time,
>>> -Daniel
>>>
>>> -- 
>>> Daniel Stickney - Linux Systems Administrator
>>> Email: [EMAIL PROTECTED]
>>>
>>> _______________________________________________
>>> Linux-HA mailing list
>>> [email protected]
>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>> See also: http://linux-ha.org/ReportingProblems
>>>     
> -- 
>
> Daniel Stickney - Linux Systems Administrator
> Email: [EMAIL PROTECTED]
>
