On Dec 11, 2007, at 5:46 PM, Franck Ganachaud wrote:

If I may add a question: can I disable the start or stop action on a resource by adding the following to a primitive?
<op disabled="true" id="mysql_orb_start" name="start"/>

no.  start is implicit.  sorry.

Franck

Franck Ganachaud a écrit :
Ok, I think I'll have to deal with the start/stop.

2 questions.
- Every time I start heartbeat, the first thing it does is check mysql_orb; if it's running, it stops it and then tries to start it again. Why does heartbeat want to stop the cloned resource if it was previously started? Can I change this behaviour?

probably because it thinks each instance is running on multiple nodes.

try adding globally_unique=false as a meta attribute for the clone
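For reference, in this CIB format a clone meta attribute would look something like the sketch below, assuming your heartbeat version supports a <meta_attributes> block on the clone (the block and nvpair ids here are made up; adjust them to your naming scheme):

```xml
<clone id="MySQL_ORB" interleave="false" is_managed="true" notify="false" ordered="false">
  <meta_attributes id="MySQL_ORB_meta_attr">
    <attributes>
      <!-- hypothetical id; tells the PE the clone instances are interchangeable -->
      <nvpair id="MySQL_ORB_meta_0" name="globally_unique" value="false"/>
    </attributes>
  </meta_attributes>
  ...
</clone>
```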


- When the mysql_orb monitor returns KO on server_a, group_1 isn't relocated to server_b.

I don't understand - it's running on both, isn't it?


I copy the log for the second question.
------------ Log start --------------
Dec 11 14:32:38 server_a mysql_orb[24127]: [24133]: INFO: database is KO
Dec 11 14:32:38 server_a crmd: [15760]: info: process_lrm_event: LRM operation mysql_orb1:0_monitor_30000 (call=27, rc=7) complete
Dec 11 14:32:38 server_a cib: [15756]: info: cib_diff_notify: Update (client: 15760, call:85): 0.316.4076 -> 0.316.4077 (ok)
Dec 11 14:32:38 server_a tengine: [15767]: info: te_update_diff: Processing diff (cib_update): 0.316.4076 -> 0.316.4077
Dec 11 14:32:38 server_a tengine: [15767]: info: process_graph_event: Detected action mysql_orb1:0_monitor_30000 from a different transition: 5 vs. 9
Dec 11 14:32:38 server_a tengine: [15767]: info: update_abort_priority: Abort priority upgraded to 1000000
Dec 11 14:32:38 server_a tengine: [15767]: WARN: update_failcount: Updating failcount for mysql_orb1:0 on ffff098f-d059-4bf2-8aaa-58910ad9d9a0 after failed monitor: rc=7
Dec 11 14:32:38 server_a crmd: [15760]: info: do_state_transition: server_a: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_IPC_MESSAGE origin=route_message ]
Dec 11 14:32:38 server_a crmd: [15760]: info: do_state_transition: All 2 cluster nodes are eligable to run resources.
Dec 11 14:32:38 server_a cib: [15756]: info: cib_diff_notify: Update (client: 15767, call:5): 0.316.4077 -> 0.316.4078 (ok)
Dec 11 14:32:38 server_a tengine: [15767]: info: te_update_diff: Processing diff (cib_modify): 0.316.4077 -> 0.316.4078
Dec 11 14:32:38 server_a tengine: [15767]: info: extract_event: Aborting on transient_attributes changes for ffff098f-d059-4bf2-8aaa-58910ad9d9a0
Dec 11 14:32:38 server_a tengine: [15767]: info: te_update_diff: Aborting on transient_attributes deletions
Dec 11 14:32:38 server_a pengine: [15768]: info: log_data_element: process_pe_message: [generation] <cib admin_epoch="0" cib-last-written="Tue Dec 11 14:08:37 2007" cib_feature_revision="1.3" epoch="316" generated="true" have_quorum="true" ignore_dtd="false" num_peers="2" num_updates="4077" ccm_transition="2" dc_uuid="ffff098f-d059-4bf2-8aaa-58910ad9d9a0"/>
Dec 11 14:32:38 server_a pengine: [15768]: notice: cluster_option: Using default value 'stop' for cluster option 'no-quorum-policy'
Dec 11 14:32:38 server_a pengine: [15768]: notice: cluster_option: Using default value '60s' for cluster option 'cluster-delay'
Dec 11 14:32:38 server_a pengine: [15768]: notice: cluster_option: Using default value '-1' for cluster option 'pe-error-series-max'
Dec 11 14:32:39 server_a pengine: [15768]: notice: cluster_option: Using default value '-1' for cluster option 'pe-warn-series-max'
Dec 11 14:32:39 server_a pengine: [15768]: notice: cluster_option: Using default value '-1' for cluster option 'pe-input-series-max'
Dec 11 14:32:39 server_a pengine: [15768]: notice: cluster_option: Using default value 'true' for cluster option 'startup-fencing'
Dec 11 14:32:39 server_a pengine: [15768]: info: determine_online_status: Node server_a is online
Dec 11 14:32:39 server_a pengine: [15768]: info: determine_online_status: Node server_b is online
Dec 11 14:32:39 server_a pengine: [15768]: info: clone_print: Clone Set: MySQL_ORB
Dec 11 14:32:39 server_a pengine: [15768]: info: native_print: mysql_orb1:0 (heartbeat::ocf:mysql_orb): Started server_a
Dec 11 14:32:39 server_a pengine: [15768]: info: native_print: mysql_orb1:1 (heartbeat::ocf:mysql_orb): Started server_b
Dec 11 14:32:39 server_a pengine: [15768]: info: group_print: Resource Group: group_1
Dec 11 14:32:39 server_a pengine: [15768]: info: native_print: IPaddr_192_168_87_100 (heartbeat::ocf:IPaddr): Started server_a
Dec 11 14:32:39 server_a pengine: [15768]: info: native_print: apache_2 (heartbeat::ocf:apache): Started server_a
Dec 11 14:32:39 server_a cib: [24134]: info: write_cib_contents: Wrote version 0.316.4078 of the CIB to disk (digest: 47227e2fc6e2d490dda64562d69194ac)
Dec 11 14:32:39 server_a pengine: [15768]: notice: NoRoleChange: Leave resource mysql_orb1:0 (server_a)
Dec 11 14:32:39 server_a pengine: [15768]: notice: NoRoleChange: Leave resource mysql_orb1:1 (server_b)
Dec 11 14:32:39 server_a pengine: [15768]: info: native_color: Combine scores from apache_2 and IPaddr_192_168_87_100
Dec 11 14:32:39 server_a pengine: [15768]: notice: NoRoleChange: Leave resource IPaddr_192_168_87_100 (server_a)
Dec 11 14:32:40 server_a pengine: [15768]: notice: NoRoleChange: Leave resource apache_2 (server_a)
------------ Log stop --------------


Andrew Beekhof a écrit :

On Dec 11, 2007, at 12:00 PM, Franck Ganachaud wrote:

Thanks Andrew

I changed the configuration to this:

  <resources>
    <clone id="MySQL_ORB" interleave="false" is_managed="true" notify="false" ordered="false">
      <instance_attributes id="MySQL_ORB_inst_attr">
        <attributes>
          <nvpair id="MySQL_ORB_attr_0" name="clone_max" value="2"/>
          <nvpair id="MySQL_ORB_attr_1" name="clone_node_max" value="1"/>
        </attributes>
      </instance_attributes>
      <primitive class="ocf" id="mysql_orb1" is_managed="false" provider="heartbeat" type="mysql_orb">
        <operations>
          <op id="mysql_orb_mon" interval="30s" name="monitor" on_fail="stop" timeout="30s"/>
        </operations>
      </primitive>
    </clone>
    <group id="group_1" restart_type="restart">
      <primitive class="ocf" id="IPaddr_Cluster" provider="heartbeat" type="IPaddr">
        <operations>
          <op id="IPaddr_Cluster_mon" interval="5s" name="monitor" timeout="5s"/>
        </operations>
        <instance_attributes id="IPaddr_Cluster_inst_attr">
          <attributes>
            <nvpair id="IPaddr_Cluster_attr_0" name="ip" value="192.168.87.100"/>
          </attributes>
        </instance_attributes>
      </primitive>
      <primitive class="ocf" id="apache_2" provider="heartbeat" type="apache">
        <operations>
          <op id="apache_2_mon" interval="30s" name="monitor" timeout="30s"/>
        </operations>
        <instance_attributes id="apache_2_inst_attr">
          <attributes>
            <nvpair id="apache_2_attr_0" name="configfile" value="/usr/local/apache/conf/httpd.conf"/>
          </attributes>
        </instance_attributes>
      </primitive>
    </group>
  </resources>
  <constraints>
    <rsc_location id="rsc_location_group_1" rsc="group_1">
      <rule id="prefered_location_group_1" score="100">
        <expression attribute="#uname" id="prefered_location_group_1_expr" operation="eq" value="server_a"/>
      </rule>
    </rsc_location>
    <rsc_colocation from="group_1" id="web_if_mysql" score="INFINITY" to="MySQL_ORB"/>
  </constraints>


With this configuration, DB monitoring no longer occurs. As for the is_managed parameter, I don't know what to set on the clone versus the primitive inside it to get the behaviour I want: don't start/stop the resource at startup, just monitor it.

hmmm... I forgot we don't start recurring monitor actions for unmanaged resources.

If the monitor was already running, we wouldn't stop it when you changed the resource to unmanaged... but that doesn't help you :-(
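For anyone reading along: the recurring monitor in the posted config is only scheduled while the resource is managed, so a sketch of the managed variant (same ids as the config above, only is_managed flipped) would be:

```xml
<!-- managed, so the recurring monitor op gets scheduled by the cluster -->
<primitive class="ocf" id="mysql_orb1" is_managed="true" provider="heartbeat" type="mysql_orb">
  <operations>
    <op id="mysql_orb_mon" interval="30s" name="monitor" on_fail="stop" timeout="30s"/>
  </operations>
</primitive>
```

The trade-off is the one described above: with the resource managed, the cluster will also start and stop the clone itself.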
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems



