Hi All,
We encountered an issue when starting/stopping LSB resources (crm resource
start test_agent_clone) in our 3-node cluster setup; see the log extract
below.
There is no issue starting or stopping the service manually, i.e. service
testservice start/stop, and we also ran the compatibility check on the LSB
script as described here: http://linux-ha.org/wiki/LSB_Resource_Agents
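For reference, the exit-code part of that compatibility check can be sketched roughly as below. This is only an illustration: the helper names (run_init_script, check_lsb) and the /etc/init.d/testservice path in the usage note are assumptions made for the example, and the last check on the wiki page (status of a failed service) is left out because it needs the service to be broken by hand.

```python
import subprocess

# Expected exit codes for the start/status/stop sequence of an LSB script.
# Each entry is (action, expected exit code, description).
LSB_STEPS = [
    ("start", 0, "start a stopped service"),
    ("status", 0, "status of a running service"),
    ("start", 0, "start an already running service"),
    ("stop", 0, "stop a running service"),
    ("status", 3, "status of a stopped service"),
    ("stop", 0, "stop an already stopped service"),
]


def run_init_script(script, action):
    """Run `<script> <action>` and return its exit code."""
    return subprocess.call([script, action])


def check_lsb(script, runner=run_init_script):
    """Run the steps in order; return (description, expected, actual) per failure."""
    failures = []
    for action, expected, description in LSB_STEPS:
        actual = runner(script, action)
        if actual != expected:
            failures.append((description, expected, actual))
    return failures
```

Calling check_lsb("/etc/init.d/testservice") as root (path assumed) should return an empty list for a compliant script. The runner argument is only there so the sequence can be exercised without touching a real service.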
The resource is declared as follows:
primitive test_agent lsb:testservice \
    op monitor interval="10" timeout="30" start-delay="60" \
    op start interval="0" timeout="60" \
    op stop interval="0" timeout="60"
….
….
….
clone test_agent_clone test_agent \
    meta target-role="Started" is-managed="true"
We would appreciate any help in resolving this issue.
Cheers,
Jimmy.
06:58:50 node03 cib[3799]: notice: cib:diff: Diff: --- 1.330.19
06:58:50 node03 cib[3799]: notice: cib:diff: Diff: +++ 1.331.1
7a89ee0288a607e2b56bf66d6a3acf50
06:58:50 node03 cib[3799]: notice: cib:diff: -- <nvpair
value="Stopped" id="test_agent_clone-meta_attributes-target-role" />
06:58:50 node03 cib[3799]: notice: cib:diff: ++ <nvpair
id="test_agent_clone-meta_attributes-target-role" name="target-role"
value="Started" />
06:58:50 node03 crmd[3804]: notice: do_state_transition: State transition
S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL
origin=abort_transition_graph ]
06:58:50 node03 pengine[3803]: notice: unpack_config: On loss of CCM Quorum:
Ignore
06:58:50 node03 pengine[3803]: crit: get_timet_now: Defaulting to 'now'
06:58:50 node03 pengine[3803]: crit: get_timet_now: Defaulting to 'now'
06:58:50 node03 pengine[3803]: crit: get_timet_now: Defaulting to 'now'
06:58:50 node03 pengine[3803]: crit: get_timet_now: Defaulting to 'now'
06:58:50 node03 pengine[3803]: warning: unpack_rsc_op: Processing failed op
monitor for test_agent:0 on node02: not running (7)
06:58:50 node03 pengine[3803]: crit: get_timet_now: Defaulting to 'now'
06:58:50 node03 pengine[3803]: crit: get_timet_now: Defaulting to 'now'
06:58:50 node03 pengine[3803]: warning: unpack_rsc_op: Processing failed op
monitor for test_agent:0 on node03: not running (7)
06:58:50 node03 pengine[3803]: notice: LogActions: Start
test_agent:0#011(node02)
06:58:50 node03 pengine[3803]: notice: LogActions: Start
test_agent:1#011(node03)
06:58:50 node03 pengine[3803]: notice: LogActions: Start
test_agent:2#011(node01)
06:58:50 node03 pengine[3803]: notice: process_pe_message: Calculated
Transition 919: /var/lib/pacemaker/pengine/pe-input-227.bz2
06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error: Entity: line 1:
parser error : invalid character in attribute value
06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error:
:0:7063dfed-0c75-4608-9f1f-258e9874ad22" lrmd_rsc_output="Starting testservice:
06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error:
^
06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error: Entity: line 1:
parser error : attributes construct error
06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error:
:0:7063dfed-0c75-4608-9f1f-258e9874ad22" lrmd_rsc_output="Starting testservice:
06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error:
^
06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error: Entity: line 1:
parser error : Couldn't find end of Start Tag lrmd_notify line 1
06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error:
:0:7063dfed-0c75-4608-9f1f-258e9874ad22" lrmd_rsc_output="Starting testservice:
06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error:
^
06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error: Entity: line 1:
parser error : Extra content at the end of the document
06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error:
:0:7063dfed-0c75-4608-9f1f-258e9874ad22" lrmd_rsc_output="Starting testservice:
06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error:
^
06:58:50 node03 crmd[3804]: warning: string2xml: Parsing failed (domain=1,
level=3, code=5): Extra content at the end of the document
06:58:50 node03 crmd[3804]: warning: string2xml: String start: <lrmd_notify
lrmd_origin="send_cmd_complete_notify
06:58:50 node03 crmd[3804]: warning: string2xml: String start+692: e="start"
CRM_meta_timeout="60000"/></lrmd_notify>
06:58:50 node03 crmd[3804]: error: crm_abort: string2xml: Forked child
18986 to record non-fatal assert at xml.c:605 : String parsing error
06:58:51 node03 abrt[18987]: Saved core dump of pid 18986
(/usr/libexec/pacemaker/crmd) to /var/spool/abrt/ccpp-2013-04-24-06:58:50-18986
(18911232 bytes)
06:58:51 node03 abrtd: Directory 'ccpp-2013-04-24-06:58:50-18986' creation
detected
06:58:51 node03 crmd[3804]: notice: process_lrm_event: LRM operation
test_agent_start_0 (call=611, rc=0, cib-update=1347, confirmed=true) ok
06:58:55 node03 abrtd: Sending an email...
06:58:56 node03 abrtd: Email was sent to: root@localhost
06:58:56 node03 abrtd: Duplicate: UUID
06:58:56 node03 abrtd: DUP_OF_DIR:
/var/spool/abrt/ccpp-2013-04-16-06:24:40-3992
06:58:56 node03 abrtd: Problem directory is a duplicate of
/var/spool/abrt/ccpp-2013-04-16-06:24:40-3992
06:58:56 node03 abrtd: Deleting problem directory
ccpp-2013-04-24-06:58:50-18986 (dup of ccpp-2013-04-16-06:24:40-3992)
06:58:58 node03 crmd[3804]: notice: process_lrm_event: LRM operation
test_agent_monitor_10000 (call=330, rc=0, cib-update=1348, confirmed=false) ok
06:58:59 node03 crmd[3804]: notice: process_lrm_event: LRM operation
test_agent_monitor_10000 (call=556, rc=0, cib-update=1351, confirmed=false) ok
06:59:00 node03 crmd[3804]: notice: process_lrm_event: LRM operation
test_agent_monitor_10000 (call=473, rc=0, cib-update=1352, confirmed=false) ok
The following Pacemaker RPMs are installed on the system:
pacemaker-libs-1.1.8-7.el6.x86_64
pacemaker-1.1.8-7.el6.x86_64
pacemaker-cli-1.1.8-7.el6.x86_64
pacemaker-cluster-libs-1.1.8-7.el6.x86_64
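
The parser errors above point at a character inside lrmd_rsc_output, which appears to hold the text the init script printed ("Starting testservice: ..."). One guess, not confirmed by the log, is that the script emits a control character such as a terminal escape sequence. A minimal sketch for checking captured script output against the XML 1.0 character ranges follows; the function names and the sample string are made up for illustration and are not taken from our script.

```python
def is_xml_char(ch):
    """True if ch is allowed by the XML 1.0 `Char` production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def find_invalid_xml_chars(text):
    """Return (index, hex code point) for every character XML 1.0 forbids."""
    return [(i, hex(ord(ch))) for i, ch in enumerate(text) if not is_xml_char(ch)]


# Hypothetical output: an ESC-based cursor move before the "[  OK  ]" marker.
sample = "Starting testservice: \x1b[60G[  OK  ]"
print(find_invalid_xml_chars(sample))
```

If the real output of "service testservice start" is captured to a file and passed through find_invalid_xml_chars, any entry in the result would be a candidate for the "invalid character in attribute value" error.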