Hi,

With reference to Message 2 in the digest below, I've updated heartbeat from 2.1.3-3 to 2.99 and also installed an updated version of pacemaker, i.e. 1.0.4. I'm facing a problem when I edit/allocate resources: I dump the configuration using cibadmin -Q > a.xml and replace it using cibadmin -R -x a.xml, which gives me the error listed below:
Call cib_replace failed (-47): Update does not conform to the configured schema/DTD
<null>

When I verify the file using crm_verify -x a.xml, it shows the following errors:

a.xml:19: element attributes: Relax-NG validity error : Element instance_attributes has extra content: attributes
Relax-NG validity error : Extra element instance_attributes in interleave
a.xml:18: element instance_attributes: Relax-NG validity error : Element primitive failed to validate content
a.xml:14: element primitive: Relax-NG validity error : Element resources has extra content: primitive
a.xml:2: element configuration: Relax-NG validity error : Invalid sequence in interleave
a.xml:1: element cib: Relax-NG validity error : Element cib failed to validate content
crm_verify[3524]: 2009/07/29_16:11:54 ERROR: main: CIB did not pass DTD/schema validation
Errors found during check: config not valid

I was using this resource configuration (a.xml) in heartbeat 2.1.3, and at that time I wasn't facing any errors. I'm attaching my configuration files; kindly reply soon.

Regards,
Ahmed Munir

On Tue, Jul 28, 2009 at 7:12 PM, <[email protected]> wrote:
> Send Linux-HA mailing list submissions to
>     [email protected]
>
> To subscribe or unsubscribe via the World Wide Web, visit
>     http://lists.linux-ha.org/mailman/listinfo/linux-ha
> or, via email, send a message with subject or body 'help' to
>     [email protected]
>
> You can reach the person managing the list at
>     [email protected]
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Linux-HA digest..."
>
> Today's Topics:
>
>    1. Re: get attribute value from commandline (Andrew Beekhof)
>    2. Re: Linux-HA Digest, Vol 68, Issue 56 (Andrew Beekhof)
>    3. Re: CRM issues (Andrew Beekhof)
>    4. problems in adding time based rule to IPaddr resource (abhishek agrawal)
>    5. Re: stand_alone_ping: Node xx.yy.zz.ww is unreachable (read) ([email protected])
>    6.
HA cluster 2 IP Service and bonding (Miguel Olivares)
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 28 Jul 2009 13:52:23 +0200
> From: Andrew Beekhof <[email protected]>
> Subject: Re: [Linux-HA] get attribute value from commandline
> To: General Linux-HA mailing list <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Newer versions allow xpath queries. In your case, you'd run:
> cibadmin --query --xpath "//nvpa...@name='pingd']"
>
> On Tue, Jul 28, 2009 at 6:53 AM, MAHESH, SIDDACHETTY M (SIDDACHETTY M)<[email protected]> wrote:
> > Hi,
> >
> > I need to find out what the attribute value and calculated score is for a resource. Is it possible to get this info from the command line using some utility?
> >
> > For example, my cib.xml has this entry
> >
> > <rsc_location id="ipaddress_connected" rsc="ip_group">
> >   <rule id="ipaddress_connected_rule" score="-INFINITY" boolean_op="or">
> >     <expression id="ipaddress_connected_rule_expr_undefined" attribute="pingd" operation="not_defined"/>
> >     <expression id="ipaddress_connected_rule_expr_zero" attribute="pingd" operation="lte" value="0"/>
> >   </rule>
> > </rsc_location>
> >
> > Is it possible to get the value of the 'pingd' attribute? Also, is it possible to determine what the calculated score is for the 'ip_group' resource? I tried to determine the failure count using the 'crm_failcount' utility but it always reports a value of '1' even on multiple failures of the 'ip_group' resource ('crm_failcount -V -G -r ip_group'). Is there a means of detecting how many times a resource has failed?
> >
> > Thanks,
> > Mahesh
> >
> > _______________________________________________
> > Linux-HA mailing list
> > [email protected]
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > See also: http://linux-ha.org/ReportingProblems
>
> ------------------------------
>
> Message: 2
> Date: Tue, 28 Jul 2009 13:53:50 +0200
> From: Andrew Beekhof <[email protected]>
> Subject: Re: [Linux-HA] Linux-HA Digest, Vol 68, Issue 56
> To: General Linux-HA mailing list <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1
>
> On Mon, Jul 27, 2009 at 5:47 AM, Ahmed Munir<[email protected]> wrote:
> > Thanks for replying Mr. Andrew Beekhof.
> >
> > With reference to Message: 1, I'm using CentOS 5.3 Linux and the version of heartbeat I'm using is 2.1.3-3.
> >
> > Kindly let me know if there is a bug in this version
>
> 2.1.3 was quite some time ago; there are definitely bugs in it.
>
> > and also do please mention how to fix it.
>
> Update to a recent version of pacemaker.
>
> ------------------------------
>
> Message: 3
> Date: Tue, 28 Jul 2009 13:57:38 +0200
> From: Andrew Beekhof <[email protected]>
> Subject: Re: [Linux-HA] CRM issues
> To: General Linux-HA mailing list <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1
>
> On Wed, Jul 8, 2009 at 5:58 PM, Bret E. Palsson<[email protected]> wrote:
> > All was done in reference to http://clusterlabs.org/mediawiki/images/8/8d/Crm_cli.pdf - pages 3 and 4.
> >
> > When pasting the following I get errors. However, if I enter the crm shell and paste line by line I don't get errors and everything works dandy. Any suggestions on how I can "script" this configuration?
>
> It should work, but Dejan's the one that understands this stuff (and he's on vacation).
> I wonder if it's an escaping thing; perhaps $id is being expanded...
>
> Then again, maybe it's a bug that's been fixed since.
What version are you running?
>
> > I've also tried: crm configure show > backup (obviously after having a valid configuration to back up)
> >
> > and then pasted the contents of the backup after the erase command below. That didn't work either.
> >
> > crm<<EOF
> > configure
> > erase
> > primitive virtual_ip ocf:heartbeat:IPaddr2 operations $id="virtual_ip-operations" op monitor interval="10s" timeout="20s" start-delay="5s" params ip="10.130.0.5" nic="eth0" cidr_netmask="16" meta $id="virtual_ip-meta_attributes"
> > primitive pgpool-ha ocf:pacemaker:pgpoolha operations $id="pgpool-ha-operations" op monitor interval="10" timeout="20" start-delay="0" params pgpool="/usr/bin/pgpool" pgpoolconf="/etc/pgpool.conf" pcpconf="/etc/pcp.conf" pool_hbaconf="/etc/pool_hba.conf" forcestop="10" meta $id="pgpool-ha-meta_attributes"
> > primitive citrix-stonith stonith:external/citrix-xenserver operations $id="citrix-stonith-operations" op monitor interval="15" timeout="15" start-delay="15" params hostlist="dbcontroller1.net:dbcontroller1.master,dbcontroller2.net:dbcontroller2.master" poolMaster="10.128.250.1" poolMasterUserName="root" poolMasterPassword="nada" meta $id="citrix-stonith-meta_attributes"
> > group pgpool-ha-group virtual_ip pgpool-ha meta target-role="started"
> > clone stone-citrix citrix-stonith meta target-role="started"
> > property $id="cib-bootstrap-options" expected-quorum-votes="2" no-quorum-policy="ignore" stonith-timeout="30s" default-action-timeout="30s" cluster-delay="30s"
> > rsc_defaults $id="rsc_defaults-options" resource-stickiness="INFINITY"
> > commit
> > EOF
> >
> > OUTPUT:
> > element nvpair: Relax-NG validity error : Expecting element op, got nvpair
> > Relax-NG validity error : Extra element operations in interleave
> > element operations: Relax-NG validity error : Element primitive failed to validate content
> > element group: Relax-NG validity error : Invalid sequence in interleave
> > element
group: Relax-NG validity error : Element group failed to validate content
> > element cib: Relax-NG validity error : Element cib failed to validate content
> > crm_verify[3180]: 2009/07/08_02:02:08 ERROR: main: CIB did not pass DTD/schema validation
> > Errors found during check: config not valid
> > WARNING: 10: crm_verify(8) found errors in the CIB
> > INFO: 10: use commit force if you know what you are doing
> > Call cib_modify failed (-47): Update does not conform to the configured schema/DTD
> > <null>
>
> ------------------------------
>
> Message: 4
> Date: Tue, 28 Jul 2009 18:29:29 +0530
> From: abhishek agrawal <[email protected]>
> Subject: [Linux-HA] problems in adding time based rule to IPaddr resource
> To: [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1
>
> I was trying to add a simple rule to achieve a time-dependent target-role. Before the addition of the rule, the following CIB was working.
> <cib validate-with="pacemaker-1.0" crm_feature_set="3.0.1" have-quorum="0" admin_epoch="0" epoch="34" num_updates="5" cib-last-written="Sat Jul 25 22:38:00 2009" dc-uuid="7d28a8e7-3948-42af-8308-5275972f2e2a">
>   <configuration>
>     <crm_config>
>       <cluster_property_set id="cib-bootstrap-options">
>         <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.0.4-6dede86d6105786af3a5321ccf66b44b6914f0aa"/>
>         <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="Heartbeat"/>
>         <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="false"/>
>         <nvpair id="cib-bootstrap-options-no-quorum-policy" name="no-quorum-policy" value="ignore"/>
>       </cluster_property_set>
>     </crm_config>
>     <nodes>
>       <node id="7d28a8e7-3948-42af-8308-5275972f2e2a" uname="kf-cent-dm2" type="normal"/>
>       <node id="9d0d3088-b98a-4bc0-a8da-c500176a799c" uname="kf-cent-dm1" type="normal"/>
>     </nodes>
>     <resources>
>       <primitive class="ocf" id="failover-ip" provider="heartbeat" type="IPaddr">
>         <instance_attributes id="failover-ip-instance_attributes">
>           <nvpair id="failover-ip-instance_attributes-ip" name="ip" value="15.154.59.49"/>
>         </instance_attributes>
>         <operations>
>           <op id="failover-ip-monitor-5s" interval="5s" name="monitor"/>
>         </operations>
>         <meta_attributes id="core-hours" score="10">
>           <nvpair id="core-hour-role" name="target-role" value="started"/>
>         </meta_attributes>
>         <meta_attributes id="after-hours" score="5">
>           <nvpair id="after-hour-role" name="target-role" value="stopped"/>
>         </meta_attributes>
>       </primitive>
>     </resources>
>     <constraints/>
>     <rsc_defaults/>
>     <op_defaults/>
>   </configuration>
>
> I was trying to add the following rule:
>
> <rule id="core-hour-rule">
>   <date_expression id="9to5" operation="date_spec">
>     <date_spec hours="9-17"/>
>   </date_expression>
> </rule>
>
> so my modified cib.xml looks like the following:
>
> <cib validate-with="pacemaker-1.0" crm_feature_set="3.0.1" have-quorum="0" admin_epoch="0" epoch="34" num_updates="5" cib-last-written="Sat Jul 25 22:38:00 2009" dc-uuid="7d28a8e7-3948-42af-8308-5275972f2e2a">
>   <configuration>
>     <crm_config>
>       <cluster_property_set id="cib-bootstrap-options">
>         <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.0.4-6dede86d6105786af3a5321ccf66b44b6914f0aa"/>
>         <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="Heartbeat"/>
>         <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="false"/>
>         <nvpair id="cib-bootstrap-options-no-quorum-policy" name="no-quorum-policy" value="ignore"/>
>       </cluster_property_set>
>     </crm_config>
>     <nodes>
>       <node id="7d28a8e7-3948-42af-8308-5275972f2e2a" uname="kf-cent-dm2" type="normal"/>
>       <node id="9d0d3088-b98a-4bc0-a8da-c500176a799c" uname="kf-cent-dm1" type="normal"/>
>     </nodes>
>     <resources>
>       <primitive class="ocf" id="failover-ip" provider="heartbeat" type="IPaddr">
>         <instance_attributes id="failover-ip-instance_attributes">
>           <nvpair id="failover-ip-instance_attributes-ip" name="ip" value="15.154.59.49"/>
>         </instance_attributes>
>         <operations>
>           <op id="failover-ip-monitor-5s" interval="5s" name="monitor"/>
>         </operations>
>         <meta_attributes id="core-hours" score="10">
>           <rule id="core-hour-rule">
>             <date_expression id="9to5" operation="date_spec">
>               <date_spec hours="9-17"/>
>             </date_expression>
>           </rule>
>           <nvpair id="core-hour-role" name="target-role" value="started"/>
>         </meta_attributes>
>         <meta_attributes id="after-hours" score="5">
>           <nvpair id="after-hour-role" name="target-role" value="stopped"/>
>         </meta_attributes>
>       </primitive>
>     </resources>
>     <constraints/>
>     <rsc_defaults/>
>     <op_defaults/>
>   </configuration>
>
> But when I try to replace this file it says:
>
> Update does not conform to the configured schema/DTD
> <null>
>
> Can anyone tell me where the mistake is?
>
> --abhishek
>
> ------------------------------
>
> Message: 5
> Date: Tue, 28 Jul 2009 16:37:22 +0200
> From: [email protected]
> Subject: Re: [Linux-HA] stand_alone_ping: Node xx.yy.zz.ww is unreachable (read)
> To: [email protected], General Linux-HA mailing list <[email protected]>
> Message-ID: <q273266430-31c6ba17533bb4e341c9e10f04944...@pmq4.mod5.onet.test.onet.pl>
> Content-Type: text/plain; charset=iso-8859-2
>
> Does anybody have a clue what is going on - is this a bug, or a real problem with the connection that is not noticed by the system ping?
>
> Is there any way to replace pingd with e.g. a bash script that checks connectivity and reports a connection status to the heartbeat system (e.g. resource is stopped or resource has failed)? So that the score for a certain resource group is recalculated and, in the case of connectivity problems, the resource group is relocated to the other machine. Is this practically possible to apply to the crm-style configuration?
>
> E.g. bash subroutine:
>
> check_connection () {
>   node=$1
>   [ -z "$node" ] && return 1
>   NPACKETS=3
>   stat=0
>   ping -n -q -c $NPACKETS "$node" >/dev/null 2>&1
>   if [ "$?" -ne 0 ]; then
>     echo "ERROR: Ping node $node does not answer to ICMP pings"
>     stat=1
>   else
>     echo "INFO: Ping node $node answers to ICMP pings"
>   fi
>   return $stat
> }
>
> I would be grateful for help,
>
> Jarek
>
> "General Linux-HA mailing list" <[email protected]> wrote:
> >
> > I found additionally the error message attached below. Please advise.
> >
> > Thanks
> > Jarek
> >
> > pingd[6890]: 2009/07/24_14:47:15 debug: stand_alone_ping: Node 3.27.60.1 is alive
> > pingd[6890]: 2009/07/24_14:47:15 debug: debug2: ping_close: Closed connection to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:47:15 debug: send_update: Sent update: pingd=1000 (1 active ping nodes)
> > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: stand_alone_ping: Checking connectivity
> > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: ping_open: Got address 3.27.60.1 for 3.27.60.1
> > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: ping_open: Opened connection to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: ping_read: Got 59 bytes
> > No error message: -1: Resource temporarily unavailable (11)
> > pingd[6890]: 2009/07/24_14:47:16 debug: process_icmp_error: No error message: -1: Resource temporarily unavailable (11)
> > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: dump_v4_echo: Echo from 3.27.60.1 (exp=1238, seq=18367, id=11669, dest=3.27.60.1, data=pingd-v4): Echo Reply
> > pingd[6890]: 2009/07/24_14:47:16 info: stand_alone_ping: Node 3.27.60.1 is unreachable (read)
> > pingd[6890]: 2009/07/24_14:47:17 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:47:17 debug: debug2: ping_read: Got 59 bytes
> > No error message: -1: Resource temporarily unavailable (11)
> > pingd[6890]: 2009/07/24_14:47:17 debug: process_icmp_error: No error message: -1: Resource temporarily unavailable (11)
> > pingd[6890]: 2009/07/24_14:47:17 debug: debug2: dump_v4_echo: Echo from 3.27.60.1 (exp=1239, seq=1238, id=6890, dest=3.27.60.1, data=pingd-v4): Echo Reply
> > pingd[6890]: 2009/07/24_14:47:17 info: stand_alone_ping: Node 3.27.60.1 is unreachable (read)
> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_close: Closed connection to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:47:18 debug: send_update: Sent update: pingd=0 (0 active ping nodes)
> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: stand_alone_ping: Checking connectivity
> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_open: Got address 3.27.60.1 for 3.27.60.1
> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_open: Opened connection to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_read: Got 59 bytes
> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: dump_v4_echo: Echo from 3.27.60.1 (exp=1240, seq=1240, id=6890, dest=3.27.60.1, data=pingd-v4): Echo Reply
> > pingd[6890]: 2009/07/24_14:47:18 debug: stand_alone_ping: Node 3.27.60.1 is alive
> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_close: Closed connection to 3.27.60.1
> >
> > "General Linux-HA mailing list" <[email protected]> wrote:
> > > Below is part of the output with the error message produced by the command:
> > > /usr/lib64/heartbeat/pingd -VVV -a pingd -d 10 -m 1000 -h 3.27.60.1
> > >
> > > The machine has three network interfaces and is connected to three different subnets (3.27.x.x, 192.168.x.x - cluster subnet, 172.22.x.x - dedicated for heartbeat).
> > >
> > > pingd[6890]: 2009/07/24_14:44:36 debug: debug2: ping_close: Closed connection to 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:36 debug: send_update: Sent update: pingd=1000 (1 active ping nodes)
> > > pingd[6890]: 2009/07/24_14:44:37 debug: debug2: stand_alone_ping: Checking connectivity
> > > pingd[6890]: 2009/07/24_14:44:37 debug: debug2: ping_open: Got address 3.27.60.1 for 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:37 debug: debug2: ping_open: Opened connection to 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:37 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_read: Got 59 bytes
> > > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: dump_v4_echo: Echo from 3.27.60.1 (exp=1080, seq=1080, id=6890, dest=3.27.60.1, data=pingd-v4): Echo Reply
> > > pingd[6890]: 2009/07/24_14:44:38 debug: stand_alone_ping: Node 3.27.60.1 is alive
> > > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_close: Closed connection to 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:38 debug: send_update: Sent update: pingd=1000 (1 active ping nodes)
> > > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: stand_alone_ping: Checking connectivity
> > > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_open: Got address 3.27.60.1 for 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_open: Opened connection to 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:39 debug: debug2: ping_read: Got 262 bytes
> > > No error message: -1: Resource temporarily unavailable (11)
> > > pingd[6890]: 2009/07/24_14:44:39 debug: process_icmp_error: No error message: -1: Resource temporarily unavailable (11)
> > > pingd[6890]: 2009/07/24_14:44:39 debug: debug2: dump_v4_echo: Echo from 172.22.10.2 (exp=1081, seq=0, id=0, dest=3.27.60.1, data=E?): Unreachable Port
> > > pingd[6890]: 2009/07/24_14:44:39 info: stand_alone_ping: Node 3.27.60.1 is unreachable (read)
> > > pingd[6890]: 2009/07/24_14:44:40 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:40 debug: debug2: ping_read: Got 262 bytes
> > > No error message: -1: Resource temporarily unavailable (11)
> > > pingd[6890]: 2009/07/24_14:44:40 debug: process_icmp_error: No error message: -1: Resource temporarily unavailable (11)
> > > pingd[6890]: 2009/07/24_14:44:40 debug: debug2: dump_v4_echo: Echo from 192.168.0.5 (exp=1082, seq=0, id=0, dest=3.27.60.1, data=E?): Unreachable Port
> > > pingd[6890]: 2009/07/24_14:44:40 info: stand_alone_ping: Node 3.27.60.1 is unreachable (read)
> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_close: Closed connection to 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:41 debug: send_update: Sent update: pingd=0 (0 active ping nodes)
> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: stand_alone_ping: Checking connectivity
> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_open: Got address 3.27.60.1 for 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_open: Opened connection to 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_read: Got 59 bytes
> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: dump_v4_echo: Echo from 3.27.60.1 (exp=1083, seq=1083, id=6890, dest=3.27.60.1, data=pingd-v4): Echo Reply
> > > pingd[6890]: 2009/07/24_14:44:41 debug: stand_alone_ping: Node 3.27.60.1 is alive
> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_close: Closed connection to 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:41 debug: send_update: Sent update: pingd=1000 (1 active ping nodes)
> > >
> > > Thanks
> > > Jarek
> > >
> > > "General Linux-HA mailing list" <[email protected]> wrote:
> > > > 2009/7/24 <[email protected]>:
> > > > >
> > > > > Rpm built for RHEL5:
> > > > > heartbeat-common-2.99.2-8.1
> > > > > libheartbeat2-2.99.2-8.1
> > > > > heartbeat-2.99.2-8.1
> > > > > heartbeat-resources-2.99.2-8.1
> > > > > pacemaker-1.0.3-2.2
> > > > > pacemaker-mgmt-client-1.99.1-2.1
> > > > > libpacemaker3-1.0.3-2.2
> > > > > pacemaker-mgmt-1.99.1-2.1
> > > > >
> > > > > If I start pingd manually (alongside working heartbeat+pacemaker), it gives me the following when "stand_alone_ping: Node xx.yy.zz.ww is unreachable (read)" appears in /var/log/ha-debug:
> > > > >
> > > > > [r...@gate2]# date; /usr/lib64/heartbeat/pingd -a pingd -d 10 -m 1000 -h xx.yy.zz.ww; date
> > > > > Thu Jul 23 19:25:24 CEST 2009
> > > > > No error message: -1: Resource temporarily unavailable (11)
> > > > > No error message: -1: Resource temporarily unavailable (11)
> > > > > No error message: -1: Resource temporarily unavailable (11)
> > > > > No error message: -1: Resource temporarily unavailable (11)
> > > > > No error message: -1: Resource temporarily unavailable (11)
> > > > > No error message: -1: Resource temporarily unavailable (11)
> > > > > ...
> > > > >
> > > > > System ping reports no errors.
> > > >
> > > > If you repeat that test with some extra -V arguments, you should see more information (which would be helpful).
> > > > But it's pretty clear there must be a bug, so it's probably worth creating an entry in bugzilla.
>
> ------------------------------
>
> Message: 6
> Date: Tue, 28 Jul 2009 16:55:21 +0200
> From: Miguel Olivares <[email protected]>
> Subject: [Linux-HA] HA cluster 2 IP Service and bonding
> To: [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Hello,
>
> I have 2 servers in my cluster with 4 Ethernet cards each; I used bonding in order to have full redundancy. I didn't have any problem with my configuration; it works, but not as well as I expected. What I want is that when one of the two bonding interfaces ("bond0" or "bond1") on the primary server stops responding, that server gives up and the other server can take over the service, but I don't know how to do that. I've included my config files ha.cf and haresources.
>
> eth0 and eth1 -> bond0  # on Server1 and Server2
> eth2 and eth3 -> bond1  # on Server1 and Server2
>
> 192.168.1.10  Server1      # bond0
> 192.168.1.11  Server2      # bond0
> 192.168.1.12  virtual IP1
>
> 172.16.10.10  Server1      # bond1
> 172.16.10.11  Server2      # bond1
> 172.16.10.12  virtual IP2
>
> In my configuration I see both virtual IPs on Server1, but when I put down bond1, Server1 still continues as the primary node.
> Can anybody help me?
>
> Thanks,
>
> regards.
>
> [ha.cf]
> debugfile /var/log/ha-debug
> logfile /var/log/ha-log
> keepalive 2
> deadtime 30
> warntime 10
> initdead 60
> udpport 694
> bcast bond0 bond1
> auto_failback off
> node Server1 Server2
> ping 192.168.1.254
> ping 172.16.10.254
>
> [haresources]
> Server1 192.168.1.12 172.16.1.12
>
> ------------------------------
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>
> End of Linux-HA Digest, Vol 68, Issue 63
> ****************************************

--
Regards,
Ahmed Munir
<cib epoch="7" num_updates="3" admin_epoch="0" validate-with="pacemaker-1.0" crm_feature_set="3.0.1" have-quorum="1" cib-last-written="Wed Jul 29 15:27:45 2009" dc-uuid="70503c2e-bb4a-48f8-aab3-53696656a4d0">
<configuration>
<crm_config>
<cluster_property_set id="cib-bootstrap-options">
<attributes>
<nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.0.4-6dede86d6105786af3a5321ccf66b44b6914f0aa"/>
<nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="Heartbeat"/>
</attributes>
</cluster_property_set>
</crm_config>
<nodes>
<node id="70503c2e-bb4a-48f8-aab3-53696656a4d0" uname="ha2" type="normal"/>
<node id="e651c120-b9a1-489a-baf7-caf0028ad540" uname="ha1" type="normal"/>
</nodes>
<resources>
<primitive class="ocf" provider="heartbeat" type="IPaddr" id="IPaddr_1">
<operations>
<op id="IPaddr_1_mon" interval="10s" name="monitor" timeout="8s"/>
</operations>
<instance_attributes id="IPaddr_1_inst_attr">
<attributes>
<nvpair name="ip" value="192.168.0.184" id="IPaddr_1_machine_1"/>
</attributes>
</instance_attributes>
</primitive>
<primitive class="ocf" provider="heartbeat" type="IPaddr" id="IPaddr_2">
<operations>
<op id="IPaddr_2_mon" interval="10s" name="monitor" timeout="8s"/>
</operations>
<instance_attributes id="IPaddr_2_inst_attr">
<attributes>
<nvpair name="ip" value="192.168.0.185" id="IPaddr_2_machine_2"/>
</attributes>
</instance_attributes>
</primitive>
</resources>
<constraints>
<rsc_location id="rsc_location_IPaddr_1" rsc="IPaddr_1">
<rule id="prefered_location_IPaddr_1" score="200">
<expression attribute="#uname" id="prefered_location_IPaddr_1_expr" operation="eq" value="ha1"/>
</rule>
</rsc_location>
<rsc_location id="rsc_location_IPaddr_2" rsc="IPaddr_2">
<rule id="prefered_location_IPaddr_2" score="200">
<expression attribute="#uname" id="prefered_location_IPaddr_2_expr" operation="eq" value="ha2"/>
</rule>
</rsc_location>
<rsc_location id="my1_resource1:connected" rsc="IPaddr_1">
<rule id="my1_resource1:connected:rule" score_attribute="pingd">
<expression id="my1_resource1:connected:expr:defined" attribute="pingd" operation="defined"/>
</rule>
</rsc_location>
<rsc_location id="my2_resource2:connected" rsc="IPaddr_2">
<rule id="my2_resource2:connected:rule" score_attribute="pingd">
<expression id="my2_resource2:connected:expr:defined" attribute="pingd" operation="defined"/>
</rule>
</rsc_location>
</constraints>
</configuration>
<status>
<node_state id="70503c2e-bb4a-48f8-aab3-53696656a4d0" uname="ha2" ha="active" in_ccm="true" crmd="online" join="member" expected="member" crm-debug-origin="do_state_transition" shutdown="0">
<lrm id="70503c2e-bb4a-48f8-aab3-53696656a4d0">
<lrm_resources/>
</lrm>
<transient_attributes id="70503c2e-bb4a-48f8-aab3-53696656a4d0">
<instance_attributes id="status-70503c2e-bb4a-48f8-aab3-53696656a4d0">
<nvpair id="status-70503c2e-bb4a-48f8-aab3-53696656a4d0-probe_complete" name="probe_complete" value="true"/>
<nvpair id="status-70503c2e-bb4a-48f8-aab3-53696656a4d0-pingd" name="pingd" value="100"/>
</instance_attributes>
</transient_attributes>
</node_state>
</status>
</cib>
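A note on the validation errors above: the pacemaker-1.0 schema no longer accepts the heartbeat-2-era <attributes> wrapper elements present in this file; nvpair elements are expected directly inside instance_attributes (and cluster_property_set). The supported route is pacemaker's own CIB upgrade tooling, but purely as a hypothetical illustration of the transformation (this helper is not part of any shipped tool), the wrappers could be stripped like this:

```python
# Sketch: promote the children of old heartbeat-2-style <attributes>
# wrappers so <nvpair> elements sit directly under their parent element,
# as the pacemaker-1.0 Relax-NG schema expects. Hypothetical helper for
# illustration only; pacemaker's upgrade tooling is the real path.
import xml.etree.ElementTree as ET

def strip_attributes_wrappers(xml_text):
    root = ET.fromstring(xml_text)
    # Materialize the element list first, since we mutate the tree.
    for parent in list(root.iter()):
        for wrapper in list(parent.findall("attributes")):
            idx = list(parent).index(wrapper)
            parent.remove(wrapper)
            # Re-insert the wrapper's children at the wrapper's position.
            for child in reversed(list(wrapper)):
                parent.insert(idx, child)
    return ET.tostring(root, encoding="unicode")

old_style = (
    '<instance_attributes id="IPaddr_1_inst_attr">'
    '<attributes>'
    '<nvpair name="ip" value="192.168.0.184" id="IPaddr_1_machine_1"/>'
    '</attributes>'
    '</instance_attributes>'
)
print(strip_attributes_wrappers(old_style))
```

Applied to the attached a.xml, this would leave the nvpairs in place without the wrapper elements the schema rejects.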
ha.cf
Description: Binary data
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
