Thank you Dejan,
I tried changing the script so that instead of providing a "report"
action it now accepts "status". Specifically, I changed this:

    'report' )
        "$mysqld_multi" report $2
        ;;

to this:

    'status' )
        "$mysqld_multi" report $2
        ;;
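As an aside, my understanding is that the cluster's monitor operation relies
on the LSB exit-code convention for the status action: exit 0 when the daemon
is running, exit 3 when it is stopped. The sketch below is only my own
illustration of that convention, not taken from the real init script; the
`daemon_status` helper and the pidof check are my inventions, and a real
script might check a PID file instead:

```shell
# Sketch of an LSB-style status check. Exit codes follow the LSB init
# script convention: 0 = program is running, 3 = program is not running.
# Using pidof here is an assumption; mysqld_multi tracks PIDs its own way.
daemon_status() {
    name="$1"
    if pidof "$name" >/dev/null 2>&1; then
        echo "$name is running"
        return 0
    else
        echo "$name is stopped"
        return 3
    fi
}
```

If `mysqld_multi report` only prints text and always exits 0, the monitor
would presumably never see the resource as failed, which might explain the
lack of a failover.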
I was hoping this would return a proper status and allow a failover. The
messages disappeared from the log file, so that was a good start. When I
killed mysql on the primary node, however, there was no failover, and
crm_mon on both nodes indicated that mysql was still alive on the
primary node. I grabbed this from my log file:
Mar 30 10:20:31 DBSUAT1A.intranet.mydomain.com pengine: [15123]: info:
unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Mar 30 10:20:31 DBSUAT1A.intranet.mydomain.com pengine: [15123]: info:
determine_online_status: Node dbsuat1b.intranet.mydomain.com is online
Mar 30 10:20:31 DBSUAT1A.intranet.mydomain.com pengine: [15123]: notice:
unpack_rsc_op: Operation mysqld_2_monitor_0 found resource mysqld_2
active on dbsuat1b.intranet.mydomain.com
Mar 30 10:20:31 DBSUAT1A.intranet.mydomain.com pengine: [15123]: info:
determine_online_status: Node dbsuat1a.intranet.mydomain.com is online
Mar 30 10:20:31 DBSUAT1A.intranet.mydomain.com pengine: [15123]: notice:
unpack_rsc_op: Operation mysqld_2_monitor_0 found resource mysqld_2
active on dbsuat1a.intranet.mydomain.com
Mar 30 10:20:31 DBSUAT1A.intranet.mydomain.com pengine: [15123]: notice:
group_print: Resource Group: group_1
Mar 30 10:20:31 DBSUAT1A.intranet.mydomain.com pengine: [15123]: notice:
native_print: IPaddr2_1 (ocf::heartbeat:IPaddr2): Started
dbsuat1a.intranet.mydomain.com
Mar 30 10:20:31 DBSUAT1A.intranet.mydomain.com pengine: [15123]: notice:
native_print: mysqld_2 (lsb:mysqld): Started
dbsuat1a.intranet.mydomain.com
Mar 30 10:20:31 DBSUAT1A.intranet.mydomain.com pengine: [15123]: notice:
LogActions: Leave resource IPaddr2_1 (Started
dbsuat1a.intranet.mydomain.com)
Mar 30 10:20:31 DBSUAT1A.intranet.mydomain.com pengine: [15123]: notice:
LogActions: Leave resource mysqld_2 (Started
dbsuat1a.intranet.mydomain.com)
Mar 30 10:20:31 DBSUAT1A.intranet.mydomain.com pengine: [15123]: info:
process_pe_message: Transition 7: PEngine Input stored in:
/usr/var/lib/pengine/pe-input-801.bz2
Mar 30 10:20:31 DBSUAT1A.intranet.mydomain.com crmd: [3300]: info:
do_state_transition: State transition S_POLICY_ENGINE ->
S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE
origin=handle_response ]
Mar 30 10:20:31 DBSUAT1A.intranet.mydomain.com pengine: [15123]: info:
process_pe_message: Configuration WARNINGs found during PE processing.
Please run "crm_verify -L" to identify issues.
Any ideas?
Dejan Muhamedagic wrote:
> Hi,
>
> On Tue, Mar 30, 2010 at 10:24:59AM -0300, mike wrote:
>
>> Also noticed another oddity. I killed mysql on the primary node fully
>> expecting it to either trigger a failover or a restart of mysql on the
>> primary node; I wasn't 100% sure which. Well, nothing happened. I do
>> however see a number of messages like this in the ha-log:
>>
>> Mar 30 08:59:27 DBSUAT1A.intranet.mydomain.com lrmd: [3297]: info: RA
>> output: (mysqld_2:monitor:stderr) Usage: /etc/init.d/mysqld
>> {start|stop|report|restart}
>>
>
> Looks like the script doesn't support the status action. If so,
> then it can't be used in a cluster.
>
> Thanks,
>
> Dejan
>
>
>> mike wrote:
>>
>>> Thanks for the reply Florian.
>>> I installed from tar ball so am a little unsure of the releases but
>>> looking at the READMEs I see this
>>> heartbeat-3.0.2
>>> Pacemaker-1-0-17 (I think)
>>>
>>> They are all fairly recent; I downloaded them from hg.linux-ha.org about
>>> 3 months ago. If you know of a file I can check to be 100% sure of the
>>> version #, let me know.
>>> Here's my configuration:
>>> cib.xml:
>>> <cib admin_epoch="0" epoch="9" validate-with="transitional-0.6"
>>> crm_feature_set="3.0.1" have-quorum="1" num_updates="25"
>>> cib-last-written="Mon Mar 29 21:55:01 2010"
>>> dc-uuid="e99889ee-da15-4b09-bfc7-641e3ac0687f">
>>> <configuration>
>>> <crm_config>
>>> <cluster_property_set id="cib-bootstrap-options">
>>> <attributes>
>>> <nvpair id="cib-bootstrap-options-symmetric-cluster"
>>> name="symmetric-cluster" value="true"/>
>>> <nvpair id="cib-bootstrap-options-no-quorum-policy"
>>> name="no-quorum-policy" value="stop"/>
>>> <nvpair id="cib-bootstrap-options-default-resource-stickiness"
>>> name="default-resource-stickiness" value="0"/>
>>> <nvpair
>>> id="cib-bootstrap-options-default-resource-failure-stickiness"
>>> name="default-resource-failure-stickiness" value="0"/>
>>> <nvpair id="cib-bootstrap-options-stonith-enabled"
>>> name="stonith-enabled" value="false"/>
>>> <nvpair id="cib-bootstrap-options-stonith-action"
>>> name="stonith-action" value="reboot"/>
>>> <nvpair id="cib-bootstrap-options-startup-fencing"
>>> name="startup-fencing" value="true"/>
>>> <nvpair id="cib-bootstrap-options-stop-orphan-resources"
>>> name="stop-orphan-resources" value="true"/>
>>> <nvpair id="cib-bootstrap-options-stop-orphan-actions"
>>> name="stop-orphan-actions" value="true"/>
>>> <nvpair id="cib-bootstrap-options-remove-after-stop"
>>> name="remove-after-stop" value="false"/>
>>> <nvpair id="cib-bootstrap-options-short-resource-names"
>>> name="short-resource-names" value="true"/>
>>> <nvpair id="cib-bootstrap-options-transition-idle-timeout"
>>> name="transition-idle-timeout" value="5min"/>
>>> <nvpair id="cib-bootstrap-options-default-action-timeout"
>>> name="default-action-timeout" value="20s"/>
>>> <nvpair id="cib-bootstrap-options-is-managed-default"
>>> name="is-managed-default" value="true"/>
>>> <nvpair id="cib-bootstrap-options-cluster-delay"
>>> name="cluster-delay" value="60s"/>
>>> <nvpair id="cib-bootstrap-options-pe-error-series-max"
>>> name="pe-error-series-max" value="-1"/>
>>> <nvpair id="cib-bootstrap-options-pe-warn-series-max"
>>> name="pe-warn-series-max" value="-1"/>
>>> <nvpair id="cib-bootstrap-options-pe-input-series-max"
>>> name="pe-input-series-max" value="-1"/>
>>> <nvpair id="cib-bootstrap-options-dc-version"
>>> name="dc-version" value="1.0.6-17fe0022afda074a937d934b3eb625eccd1f01ef"/>
>>> <nvpair id="cib-bootstrap-options-cluster-infrastructure"
>>> name="cluster-infrastructure" value="Heartbeat"/>
>>> </attributes>
>>> </cluster_property_set>
>>> </crm_config>
>>> <nodes>
>>> <node id="e99889ee-da15-4b09-bfc7-641e3ac0687f"
>>> uname="dbsuat1b.intranet.mydomain.com" type="normal"/>
>>> <node id="db80324b-c9de-4995-a66a-eedf93abb42c"
>>> uname="dbsuat1a.intranet.mydomain.com" type="normal"/>
>>> </nodes>
>>> <resources>
>>> <group id="group_1">
>>> <primitive class="ocf" id="IPaddr2_1" provider="heartbeat"
>>> type="IPaddr2">
>>> <operations>
>>> <op id="IPaddr2_1_mon" interval="5s" name="monitor"
>>> timeout="5s"/>
>>> </operations>
>>> <instance_attributes id="IPaddr2_1_inst_attr">
>>> <attributes>
>>> <nvpair id="IPaddr2_1_attr_0" name="ip"
>>> value="172.28.185.49"/>
>>> </attributes>
>>> </instance_attributes>
>>> </primitive>
>>> <primitive class="lsb" id="mysqld_2" provider="heartbeat"
>>> type="mysqld">
>>> <operations>
>>> <op id="mysqld_2_mon" interval="120s" name="monitor"
>>> timeout="60s"/>
>>> </operations>
>>> </primitive>
>>> </group>
>>> </resources>
>>> <constraints>
>>> <rsc_location id="rsc_location_group_1" rsc="group_1">
>>> <rule id="prefered_location_group_1" score="100">
>>> <expression attribute="#uname"
>>> id="prefered_location_group_1_expr" operation="eq"
>>> value="DBSUAT1A.intranet.mydomain.com"/>
>>> </rule>
>>> </rsc_location>
>>> </constraints>
>>> </configuration>
>>> <status>
>>> <node_state id="e99889ee-da15-4b09-bfc7-641e3ac0687f"
>>> uname="dbsuat1b.intranet.mydomain.com" ha="active" in_ccm="true"
>>> crmd="online" join="member" expected="member"
>>> crm-debug-origin="do_update_resource" shutdown="0">
>>> <transient_attributes id="e99889ee-da15-4b09-bfc7-641e3ac0687f">
>>> <instance_attributes
>>> id="status-e99889ee-da15-4b09-bfc7-641e3ac0687f">
>>> <attributes>
>>> <nvpair
>>> id="status-e99889ee-da15-4b09-bfc7-641e3ac0687f-probe_complete"
>>> name="probe_complete" value="true"/>
>>> </attributes>
>>> </instance_attributes>
>>> </transient_attributes>
>>> <lrm id="e99889ee-da15-4b09-bfc7-641e3ac0687f">
>>> <lrm_resources>
>>> <lrm_resource id="IPaddr2_1" type="IPaddr2" class="ocf"
>>> provider="heartbeat">
>>> <lrm_rsc_op id="IPaddr2_1_monitor_0" operation="monitor"
>>> crm-debug-origin="build_active_RAs" crm_feature_set="3.0.1"
>>> transition-key="4:1:7:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> transition-magic="0:7;4:1:7:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> call-id="2" rc-code="7" op-status="0" interval="0"
>>> last-run="1269914318" last-rc-change="1269914318" exec-time="190"
>>> queue-time="10" op-digest="e6e4647755681224d96a4ba7fc1a3391"/>
>>> <lrm_rsc_op id="IPaddr2_1_start_0" operation="start"
>>> crm-debug-origin="build_active_RAs" crm_feature_set="3.0.1"
>>> transition-key="4:3:0:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> transition-magic="0:0;4:3:0:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> call-id="5" rc-code="0" op-status="0" interval="0"
>>> last-run="1269914319" last-rc-change="1269914319" exec-time="110"
>>> queue-time="0" op-digest="e6e4647755681224d96a4ba7fc1a3391"/>
>>> <lrm_rsc_op id="IPaddr2_1_monitor_5000" operation="monitor"
>>> crm-debug-origin="build_active_RAs" crm_feature_set="3.0.1"
>>> transition-key="5:3:0:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> transition-magic="0:0;5:3:0:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> call-id="6" rc-code="0" op-status="0" interval="5000"
>>> last-run="1269914715" last-rc-change="1269914319" exec-time="80"
>>> queue-time="0" op-digest="8124f1b5e7c7c10bbbf382d3813c9b90"/>
>>> <lrm_rsc_op id="IPaddr2_1_stop_0" operation="stop"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.1"
>>> transition-key="6:6:0:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> transition-magic="0:0;6:6:0:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> call-id="10" rc-code="0" op-status="0" interval="0"
>>> last-run="1269914720" last-rc-change="1269914720" exec-time="60"
>>> queue-time="0" op-digest="e6e4647755681224d96a4ba7fc1a3391"/>
>>> </lrm_resource>
>>> <lrm_resource id="mysqld_2" type="mysqld" class="lsb">
>>> <lrm_rsc_op id="mysqld_2_monitor_0" operation="monitor"
>>> crm-debug-origin="build_active_RAs" crm_feature_set="3.0.1"
>>> transition-key="5:1:7:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> transition-magic="0:0;5:1:7:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> call-id="3" rc-code="0" op-status="0" interval="0"
>>> last-run="1269914319" last-rc-change="1269914319" exec-time="0"
>>> queue-time="10" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
>>> <lrm_rsc_op id="mysqld_2_stop_0" operation="stop"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.1"
>>> transition-key="10:5:0:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> transition-magic="0:0;10:5:0:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> call-id="9" rc-code="0" op-status="0" interval="0"
>>> last-run="1269914720" last-rc-change="1269914720" exec-time="180"
>>> queue-time="0" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
>>> <lrm_rsc_op id="mysqld_2_start_0" operation="start"
>>> crm-debug-origin="build_active_RAs" crm_feature_set="3.0.1"
>>> transition-key="7:3:0:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> transition-magic="0:0;7:3:0:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> call-id="7" rc-code="0" op-status="0" interval="0"
>>> last-run="1269914319" last-rc-change="1269914319" exec-time="1130"
>>> queue-time="0" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
>>> <lrm_rsc_op id="mysqld_2_monitor_120000" operation="monitor"
>>> crm-debug-origin="build_active_RAs" crm_feature_set="3.0.1"
>>> transition-key="8:3:0:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> transition-magic="0:0;8:3:0:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> call-id="8" rc-code="0" op-status="0" interval="120000"
>>> last-run="1269914681" last-rc-change="1269914321" exec-time="0"
>>> queue-time="0" op-digest="873ed4f07792aa8ff18f3254244675ea"/>
>>> </lrm_resource>
>>> </lrm_resources>
>>> </lrm>
>>> </node_state>
>>> <node_state id="db80324b-c9de-4995-a66a-eedf93abb42c"
>>> uname="dbsuat1a.intranet.mydomain.com" ha="active" join="member"
>>> crm-debug-origin="do_update_resource" crmd="online" shutdown="0"
>>> in_ccm="true" expected="member">
>>> <lrm id="db80324b-c9de-4995-a66a-eedf93abb42c">
>>> <lrm_resources>
>>> <lrm_resource id="mysqld_2" type="mysqld" class="lsb">
>>> <lrm_rsc_op id="mysqld_2_monitor_0" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.1"
>>> transition-key="8:4:7:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> transition-magic="0:0;8:4:7:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> call-id="3" rc-code="0" op-status="0" interval="0"
>>> last-run="1269914718" last-rc-change="1269914718" exec-time="90"
>>> queue-time="0" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
>>> <lrm_rsc_op id="mysqld_2_stop_0" operation="stop"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.1"
>>> transition-key="11:5:0:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> transition-magic="0:0;11:5:0:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> call-id="4" rc-code="0" op-status="0" interval="0"
>>> last-run="1269914720" last-rc-change="1269914720" exec-time="310"
>>> queue-time="0" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
>>> <lrm_rsc_op id="mysqld_2_start_0" operation="start"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.1"
>>> transition-key="9:6:0:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> transition-magic="0:0;9:6:0:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> call-id="7" rc-code="0" op-status="0" interval="0"
>>> last-run="1269914723" last-rc-change="1269914723" exec-time="220"
>>> queue-time="0" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
>>> <lrm_rsc_op id="mysqld_2_monitor_120000" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.1"
>>> transition-key="10:6:0:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> transition-magic="0:0;10:6:0:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> call-id="8" rc-code="0" op-status="0" interval="120000"
>>> last-run="1269914725" last-rc-change="1269914725" exec-time="0"
>>> queue-time="0" op-digest="873ed4f07792aa8ff18f3254244675ea"/>
>>> </lrm_resource>
>>> <lrm_resource id="IPaddr2_1" type="IPaddr2" class="ocf"
>>> provider="heartbeat">
>>> <lrm_rsc_op id="IPaddr2_1_monitor_0" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.1"
>>> transition-key="7:4:7:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> transition-magic="0:7;7:4:7:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> call-id="2" rc-code="7" op-status="0" interval="0"
>>> last-run="1269914718" last-rc-change="1269914718" exec-time="120"
>>> queue-time="0" op-digest="e6e4647755681224d96a4ba7fc1a3391"/>
>>> <lrm_rsc_op id="IPaddr2_1_start_0" operation="start"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.1"
>>> transition-key="7:6:0:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> transition-magic="0:0;7:6:0:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> call-id="5" rc-code="0" op-status="0" interval="0"
>>> last-run="1269914721" last-rc-change="1269914721" exec-time="110"
>>> queue-time="0" op-digest="e6e4647755681224d96a4ba7fc1a3391"/>
>>> <lrm_rsc_op id="IPaddr2_1_monitor_5000" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.1"
>>> transition-key="8:6:0:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> transition-magic="0:0;8:6:0:443f1faa-26f0-4013-95b1-d0a43e4b7f6a"
>>> call-id="6" rc-code="0" op-status="0" interval="5000"
>>> last-run="1269914723" last-rc-change="1269914723" exec-time="210"
>>> queue-time="0" op-digest="8124f1b5e7c7c10bbbf382d3813c9b90"/>
>>> </lrm_resource>
>>> </lrm_resources>
>>> </lrm>
>>> <transient_attributes id="db80324b-c9de-4995-a66a-eedf93abb42c">
>>> <instance_attributes
>>> id="status-db80324b-c9de-4995-a66a-eedf93abb42c">
>>> <attributes>
>>> <nvpair
>>> id="status-db80324b-c9de-4995-a66a-eedf93abb42c-probe_complete"
>>> name="probe_complete" value="true"/>
>>> </attributes>
>>> </instance_attributes>
>>> </transient_attributes>
>>> </node_state>
>>> </status>
>>> </cib>
>>>
>>> Florian Haas wrote:
>>>
>>>
>>>> Mike,
>>>>
>>>> the information given reduces us to guesswork.
>>>>
>>>> - Messaging layer?
>>>> - Pacemaker version?
>>>> - Glue and agents versions?
>>>> - crm configure show?
>>>> - Logs?
>>>>
>>>> Cheers,
>>>> Florian
>>>>
>>>> On 03/30/2010 03:48 AM, mike wrote:
>>>>
>>>>
>>>>
>>>>> So here's the situation:
>>>>>
>>>>> Node A (primary node) heartbeat up and running a VIP and mysqld
>>>>> Node B (secondary node) up and running but heartbeat stopped
>>>>>
>>>>> I start heartbeat on Node B and expect it to come up quickly, which it
>>>>> does. I noticed in the logs on Node A that the cluster runs mysql start.
>>>>> Why would it do this when mysql is already running there? It doesn't
>>>>> seem to make sense to me.
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Linux-HA mailing list
>>>> [email protected]
>>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>>> See also: http://linux-ha.org/ReportingProblems
>>>>
>>>>
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems