Hi Andrew,

Updating each node in the cluster as below resolved the issue, so many thanks for the link, and apologies for reposting a similar query.
service pacemaker stop
wget -O /etc/yum.repos.d/pacemaker.repo http://clusterlabs.org/rpm-next/rhel-6/clusterlabs.repo
yum install -y pacemaker cman pacemaker-debuginfo
wget -P /etc/yum.repos.d/ http://download.opensuse.org/repositories/network:/ha-clustering/CentOS_CentOS-6/network:ha-clustering.repo
yum install crmsh.x86_64
service cman start
service pacemaker start

This updated the cluster with the following packages; there were no issues installing on the test cluster:

pacemaker-libs-devel-1.1.9-1512.el6.x86_64
pacemaker-cts-1.1.9-1512.el6.x86_64
pacemaker-cli-1.1.9-1512.el6.x86_64
pacemaker-libs-1.1.9-1512.el6.x86_64
pacemaker-1.1.9-1512.el6.x86_64
pacemaker-debuginfo-1.1.9-1512.el6.x86_64
pacemaker-cluster-libs-1.1.9-1512.el6.x86_64
resource-agents-3.9.2-12.el6.x86_64
corosynclib-1.4.1-7.el6.x86_64
corosync-1.4.1-7.el6.x86_64
corosynclib-devel-1.4.1-7.el6.x86_64
cman-3.0.12.1-32.el6.x86_64
crmsh-1.2.5-55.3.x86_64
crmsh-debuginfo-1.2.5-55.1.x86_64

Unfortunately I did not have pacemaker-debuginfo installed on the cluster where the core dumps were generated; I only realised that when I went to view them today.

Cheers,
Jimmy.

On 26 Apr 2013, at 01:17, Andrew Beekhof <[email protected]> wrote:

> I have followed up on the equivalent pacemaker mailing list thread.
> Essentially I asked whether the latest http://www.clusterlabs.org/rpm-next
> packages helped, and whether someone could open up the core file and print
> the contents of the input passed to string2xml().
>
> On 26/04/2013, at 2:00 AM, Jimmy Magee <[email protected]> wrote:
>
>> Hi All,
>>
>> We encountered an issue when starting/stopping LSB resources (crm resource
>> start test_agent_clone) in our 3-node cluster, as per the log extract below.
>> There is no issue starting the service manually (i.e. service testservice
>> start/stop), and the LSB scripts pass the compatibility checks described
>> at http://linux-ha.org/wiki/LSB_Resource_Agents.
>> The resource is declared as follows:
>>
>> primitive test_agent lsb:testservice \
>>     op monitor interval="10" timeout="30" start-delay="60" \
>>     op start interval="0" timeout="60" \
>>     op stop interval="0" timeout="60"
>> ….
>>
>> clone test_agent_clone test_agent \
>>     meta target-role="Started" is-managed="true"
>>
>> Appreciate help in resolving this issue.
>>
>> Cheers,
>> Jimmy.
>>
>> 06:58:50 node03 cib[3799]: notice: cib:diff: Diff: --- 1.330.19
>> 06:58:50 node03 cib[3799]: notice: cib:diff: Diff: +++ 1.331.1 7a89ee0288a607e2b56bf66d6a3acf50
>> 06:58:50 node03 cib[3799]: notice: cib:diff: -- <nvpair value="Stopped" id="test_agent_clone-meta_attributes-target-role" />
>> 06:58:50 node03 cib[3799]: notice: cib:diff: ++ <nvpair id="test_agent_clone-meta_attributes-target-role" name="target-role" value="Started" />
>> 06:58:50 node03 crmd[3804]: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
>> 06:58:50 node03 pengine[3803]: notice: unpack_config: On loss of CCM Quorum: Ignore
>> 06:58:50 node03 pengine[3803]: crit: get_timet_now: Defaulting to 'now'
>> 06:58:50 node03 pengine[3803]: crit: get_timet_now: Defaulting to 'now'
>> 06:58:50 node03 pengine[3803]: crit: get_timet_now: Defaulting to 'now'
>> 06:58:50 node03 pengine[3803]: crit: get_timet_now: Defaulting to 'now'
>> 06:58:50 node03 pengine[3803]: warning: unpack_rsc_op: Processing failed op monitor for test_agent:0 on node02: not running (7)
>> 06:58:50 node03 pengine[3803]: crit: get_timet_now: Defaulting to 'now'
>> 06:58:50 node03 pengine[3803]: crit: get_timet_now: Defaulting to 'now'
>> 06:58:50 node03 pengine[3803]: warning: unpack_rsc_op: Processing failed op monitor for test_agent:0 on node03: not running (7)
>> 06:58:50 node03 pengine[3803]: notice: LogActions: Start test_agent:0#011(node02)
>> 06:58:50 node03 pengine[3803]: notice: LogActions: Start test_agent:1#011(node03)
>> 06:58:50 node03 pengine[3803]: notice: LogActions: Start test_agent:2#011(node01)
>> 06:58:50 node03 pengine[3803]: notice: process_pe_message: Calculated Transition 919: /var/lib/pacemaker/pengine/pe-input-227.bz2
>> 06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error: Entity: line 1: parser error : invalid character in attribute value
>> 06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error: :0:7063dfed-0c75-4608-9f1f-258e9874ad22" lrmd_rsc_output="Starting testservice:
>> 06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error: ^
>> 06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error: Entity: line 1: parser error : attributes construct error
>> 06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error: :0:7063dfed-0c75-4608-9f1f-258e9874ad22" lrmd_rsc_output="Starting testservice:
>> 06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error: ^
>> 06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error: Entity: line 1: parser error : Couldn't find end of Start Tag lrmd_notify line 1
>> 06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error: :0:7063dfed-0c75-4608-9f1f-258e9874ad22" lrmd_rsc_output="Starting testservice:
>> 06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error: ^
>> 06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error: Entity: line 1: parser error : Extra content at the end of the document
>> 06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error: :0:7063dfed-0c75-4608-9f1f-258e9874ad22" lrmd_rsc_output="Starting testservice:
>> 06:58:50 node03 crmd[3804]: error: crm_xml_err: XML Error: ^
>> 06:58:50 node03 crmd[3804]: warning: string2xml: Parsing failed (domain=1, level=3, code=5): Extra content at the end of the document
>> 06:58:50 node03 crmd[3804]: warning: string2xml: String start: <lrmd_notify lrmd_origin="send_cmd_complete_notify
>> 06:58:50 node03 crmd[3804]: warning: string2xml: String start+692: e="start" CRM_meta_timeout="60000"/></lrmd_notify>
>> 06:58:50 node03 crmd[3804]: error: crm_abort: string2xml: Forked child 18986 to record non-fatal assert at xml.c:605 : String parsing error
>> 06:58:51 node03 abrt[18987]: Saved core dump of pid 18986 (/usr/libexec/pacemaker/crmd) to /var/spool/abrt/ccpp-2013-04-24-06:58:50-18986 (18911232 bytes)
>> 06:58:51 node03 abrtd: Directory 'ccpp-2013-04-24-06:58:50-18986' creation detected
>> 06:58:51 node03 crmd[3804]: notice: process_lrm_event: LRM operation test_agent_start_0 (call=611, rc=0, cib-update=1347, confirmed=true) ok
>> 06:58:55 node03 abrtd: Sending an email...
>> 06:58:56 node03 abrtd: Email was sent to: root@localhost
>> 06:58:56 node03 abrtd: Duplicate: UUID
>> 06:58:56 node03 abrtd: DUP_OF_DIR: /var/spool/abrt/ccpp-2013-04-16-06:24:40-3992
>> 06:58:56 node03 abrtd: Problem directory is a duplicate of /var/spool/abrt/ccpp-2013-04-16-06:24:40-3992
>> 06:58:56 node03 abrtd: Deleting problem directory ccpp-2013-04-24-06:58:50-18986 (dup of ccpp-2013-04-16-06:24:40-3992)
>> 06:58:58 node03 crmd[3804]: notice: process_lrm_event: LRM operation test_agent_monitor_10000 (call=330, rc=0, cib-update=1348, confirmed=false) ok
>> 06:58:59 node03 crmd[3804]: notice: process_lrm_event: LRM operation test_agent_monitor_10000 (call=556, rc=0, cib-update=1351, confirmed=false) ok
>> 06:59:00 node03 crmd[3804]: notice: process_lrm_event: LRM operation test_agent_monitor_10000 (call=473, rc=0, cib-update=1352, confirmed=false) ok
>>
>> The following pacemaker rpms are installed on the system:
>>
>> pacemaker-libs-1.1.8-7.el6.x86_64
>> pacemaker-1.1.8-7.el6.x86_64
>> pacemaker-cli-1.1.8-7.el6.x86_64
>> pacemaker-cluster-libs-1.1.8-7.el6.x86_64
>>
>> _______________________________________________
>> Linux-HA mailing list
>> [email protected]
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>> See also: http://linux-ha.org/ReportingProblems
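P.S. For anyone searching the archives later: the "parser error : invalid character in attribute value" above is what libxml2 reports when a raw control character ends up inside an attribute value. My guess (unconfirmed without the core file) is that the init script wrote a control byte to stdout, which crmd then embedded verbatim in the lrmd_rsc_output attribute. The failure mode is easy to reproduce outside pacemaker; here is a minimal sketch using Python's XML parser as a stand-in for string2xml(), with a made-up control character since the real offending bytes are only in the core dump:

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical agent output containing a raw backspace (0x08);
# the actual bytes from testservice are unknown.
raw_output = "Starting testservice: \x08[ OK ]"

# Embedding it unescaped in an attribute, as the lrmd_notify message does,
# produces a document that is not well-formed XML 1.0.
try:
    ET.fromstring('<lrmd_notify lrmd_rsc_output="%s"/>' % raw_output)
    parse_ok = True
except ET.ParseError:
    parse_ok = False  # control characters are illegal in XML 1.0, even escaped

# Dropping characters outside the XML 1.0 Char production (everything below
# 0x20 except tab, LF and CR) before building the document lets it parse.
_illegal = re.compile('[\x00-\x08\x0b\x0c\x0e-\x1f]')
clean = _illegal.sub('', raw_output)
elem = ET.fromstring('<lrmd_notify lrmd_rsc_output="%s"/>' % clean)
print(elem.get('lrmd_rsc_output'))
```

Presumably the rpm-next packages handle this on pacemaker's side; in the meantime, making sure the LSB script's start/stop output contains only printable text would be a reasonable workaround.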
