Hi group,

I have a problem with a cloned resource. The linux-ha.org links 
(http://linux-ha.org/v2/Concepts/Clones) all return a "page not found" 
error.


I have a clone resource in my cib.xml. My configuration is a two-node cluster 
(nodeA and nodeB) with the database as a cloned resource.

<clone id="database">
 <instance_attributes id="database_attributes">
   <attributes>
     <nvpair id="database_clone_max" name="clone_max" value="2"/>
     <nvpair id="database_clone_node_max" name="clone_node_max" value="1"/>
     <nvpair id="database_globally_unique" name="globally_unique" value="false"/>
   </attributes>
 </instance_attributes>
 <group id="database-grp">
   <primitive id="database_server" class="lsb" type="database">
     <operations>
       <op id="database_server_monitor" name="monitor" interval="120s" timeout="60s" start_delay="90s"/>
     </operations>
   </primitive>
   <primitive id="database-bs_server" class="lsb" type="database-bs">
     <operations>
       <op id="database-bs_server_monitor" name="monitor" interval="120s" timeout="60s" start_delay="90s"/>
     </operations>
   </primitive>
 </group>
</clone>

The system comes up fine and everything works until the heartbeat service 
is stopped on the active node, nodeA. The database services on nodeA then go 
down. If I restart heartbeat on nodeA, the database services (the cloned 
resource) do not come up automatically. I can restart the database services 
from the console.

I added a location constraint to the cib.xml to force the database service 
cloned resources to always run on each node.

<rsc_location id="run_database-grp_on_node_a" rsc="database-grp:1">
        <rule id="run_database-grp_on_node_a_rule" score="INFINITY">
                <expression id="run_database-grp_on_node_a_rule_expr" attribute="#uname" operation="eq" value="nodeA"/>
        </rule>
</rsc_location>

<rsc_location id="run_database-grp_on_node_b" rsc="database-grp:0">
        <rule id="run_database-grp_on_node_b_rule" score="INFINITY">
                <expression id="run_database-grp_on_node_b_rule_expr" attribute="#uname" operation="eq" value="nodeB"/>
        </rule>
</rsc_location>
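
To double-check which clone instance each constraint pins to which node, a small script can list the pairs. This is only a minimal sketch in Python, not part of heartbeat: the function name pinned_instances and the <constraints> wrapper element are assumptions added here for illustration, and the script only reads the XML text, it does not talk to the cluster.

```python
import xml.etree.ElementTree as ET

# The two constraints from this message. The <constraints> wrapper is
# added only so that the fragment parses as a single XML document.
CONSTRAINTS_XML = """
<constraints>
  <rsc_location id="run_database-grp_on_node_a" rsc="database-grp:1">
    <rule id="run_database-grp_on_node_a_rule" score="INFINITY">
      <expression id="run_database-grp_on_node_a_rule_expr"
                  attribute="#uname" operation="eq" value="nodeA"/>
    </rule>
  </rsc_location>
  <rsc_location id="run_database-grp_on_node_b" rsc="database-grp:0">
    <rule id="run_database-grp_on_node_b_rule" score="INFINITY">
      <expression id="run_database-grp_on_node_b_rule_expr"
                  attribute="#uname" operation="eq" value="nodeB"/>
    </rule>
  </rsc_location>
</constraints>
"""


def pinned_instances(xml_text):
    """Return {rsc: [(node, score), ...]} for every '#uname eq' expression.

    Hypothetical helper: it only inspects rsc_location/rule/expression
    elements and ignores every other kind of constraint.
    """
    root = ET.fromstring(xml_text)
    pins = {}
    for loc in root.iter("rsc_location"):
        for rule in loc.iter("rule"):
            for expr in rule.iter("expression"):
                if (expr.get("attribute") == "#uname"
                        and expr.get("operation") == "eq"):
                    pins.setdefault(loc.get("rsc"), []).append(
                        (expr.get("value"), rule.get("score")))
    return pins


if __name__ == "__main__":
    for rsc, targets in sorted(pinned_instances(CONSTRAINTS_XML).items()):
        for node, score in targets:
            print(f"{rsc} -> {node} (score {score})")
```

With the constraints as written, instance database-grp:1 is tied to nodeA and instance database-grp:0 to nodeB.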


This partially solves the problem. When the heartbeat service is stopped on 
nodeA, all resources (including the database services) go down on nodeA. When 
heartbeat is restarted on nodeA, the database services come up again.

I don't think the above is the correct solution. Running crm_verify on the 
updated cib.xml reports the following errors:

crm_verify[15560]: 2009/09/04_05:36:16 ERROR: clone_color: database-grp:0 is running on nodeB which isn't allowed
crm_verify[15560]: 2009/09/04_05:36:16 ERROR: clone_color: database-grp:1 is running on nodeA which isn't allowed
crm_verify[15560]: 2009/09/04_05:36:16 ERROR: color_instance: 415f21b1-23cc-44d5-95c0-d3889893a7fa not found in database (list=0)
crm_verify[15560]: 2009/09/04_05:36:16 ERROR: color_instance: 21790cf9-1645-4cd1-910e-90e4942f1b76 not found in database (list=0)
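
To make the errors above easier to compare against the constraints, the output can be reduced to (instance, node) pairs and the list of IDs that were not found. The following is a minimal sketch in Python under my own assumptions: the regular expressions are derived only from the four lines pasted in this message (with the mail line wraps removed), not from any documented crm_verify output format, and the function name summarize is hypothetical.

```python
import re

# crm_verify output from this message, one error per line.
LOG = """\
crm_verify[15560]: 2009/09/04_05:36:16 ERROR: clone_color: database-grp:0 is running on nodeB which isn't allowed
crm_verify[15560]: 2009/09/04_05:36:16 ERROR: clone_color: database-grp:1 is running on nodeA which isn't allowed
crm_verify[15560]: 2009/09/04_05:36:16 ERROR: color_instance: 415f21b1-23cc-44d5-95c0-d3889893a7fa not found in database (list=0)
crm_verify[15560]: 2009/09/04_05:36:16 ERROR: color_instance: 21790cf9-1645-4cd1-910e-90e4942f1b76 not found in database (list=0)
"""

# Patterns guessed from the lines above (assumption, not a documented format).
NOT_ALLOWED = re.compile(
    r"ERROR: clone_color: (\S+) is running on (\S+) which isn't allowed")
NOT_FOUND = re.compile(r"ERROR: color_instance: (\S+) not found in (\S+)")


def summarize(log_text):
    """Return (not_allowed, not_found) extracted from crm_verify output.

    not_allowed: list of (instance, node) pairs
    not_found:   list of (id, resource) pairs
    """
    return NOT_ALLOWED.findall(log_text), NOT_FOUND.findall(log_text)


if __name__ == "__main__":
    not_allowed, not_found = summarize(LOG)
    for instance, node in not_allowed:
        print(f"{instance} running on {node}: not allowed")
    for missing_id, resource in not_found:
        print(f"{missing_id}: not found in {resource}")
```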

The first two errors are probably caused by the explicit location constraint 
rules.

This is with heartbeat v2.1.3.

I have two questions:

1. When the heartbeat service is stopped gracefully, does it stop all resources 
running on the node?

2. Is there a better way to ensure that cloned resources always run on 
each node, even after a heartbeat restart or after a cloned resource is stopped?



Regards,
Mahesh


_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems