Hi,

With reference to Message 2 in the digest below, I've updated heartbeat from 2.1.3-3 to 2.99 and also installed an updated version of pacemaker, i.e. 1.0.4. I'm facing a problem when I edit/allocate resources: I dump the configuration using cibadmin -Q > a.xml and replace it using cibadmin -R -x a.xml, which gives me the error listed below:
Call cib_replace failed (-47): Update does not conform to the configured schema/DTD
<null>

When I verify the file using crm_verify -x a.xml, it shows the following errors:

a.xml:19: element attributes: Relax-NG validity error : Element instance_attributes has extra content: attributes
Relax-NG validity error : Extra element instance_attributes in interleave
a.xml:18: element instance_attributes: Relax-NG validity error : Element primitive failed to validate content
a.xml:14: element primitive: Relax-NG validity error : Element resources has extra content: primitive
a.xml:2: element configuration: Relax-NG validity error : Invalid sequence in interleave
a.xml:1: element cib: Relax-NG validity error : Element cib failed to validate content
crm_verify[3524]: 2009/07/29_16:11:54 ERROR: main: CIB did not pass DTD/schema validation
Errors found during check: config not valid

I was using this resource configuration (a.xml) in heartbeat 2.1.3, and at that time I wasn't facing any errors. I'm attaching my configuration files; kindly reply soon.

Regards,
Ahmed Munir

On Tue, Jul 28, 2009 at 7:12 PM, <[email protected]> wrote:
> Send Linux-HA mailing list submissions to
>     [email protected]
>
> To subscribe or unsubscribe via the World Wide Web, visit
>     http://lists.linux-ha.org/mailman/listinfo/linux-ha
> or, via email, send a message with subject or body 'help' to
>     [email protected]
>
> You can reach the person managing the list at
>     [email protected]
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Linux-HA digest..."
>
> Today's Topics:
>
>    1. Re: get attribute value from commandline (Andrew Beekhof)
>    2. Re: Linux-HA Digest, Vol 68, Issue 56 (Andrew Beekhof)
>    3. Re: CRM issues (Andrew Beekhof)
>    4. problems in adding time based rule to IPaddr resource (abhishek agrawal)
>    5. Re: stand_alone_ping: Node xx.yy.zz.ww is unreachable (read) ([email protected])
>    6.
HA cluster 2 IP Service and bonding (Miguel Olivares)
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 28 Jul 2009 13:52:23 +0200
> From: Andrew Beekhof <[email protected]>
> Subject: Re: [Linux-HA] get attribute value from commandline
> To: General Linux-HA mailing list <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Newer versions allow xpath queries. In your case, you'd run:
> cibadmin --query --xpath "//nvpa...@name='pingd']"
>
> On Tue, Jul 28, 2009 at 6:53 AM, MAHESH, SIDDACHETTY M (SIDDACHETTY M)<[email protected]> wrote:
> > Hi,
> >
> > I need to find out what the attribute value and calculated score is for a resource. Is it possible to get this info from the command line using some utility?
> >
> > For example, my cib.xml has this entry
> >
> > <rsc_location id="ipaddress_connected" rsc="ip_group">
> >   <rule id="ipaddress_connected_rule" score="-INFINITY" boolean_op="or">
> >     <expression id="ipaddress_connected_rule_expr_undefined" attribute="pingd" operation="not_defined"/>
> >     <expression id="ipaddress_connected_rule_expr_zero" attribute="pingd" operation="lte" value="0"/>
> >   </rule>
> > </rsc_location>
> >
> > Is it possible to get the value of the 'pingd' attribute? Also, is it possible to determine what the calculated score is for the 'ip_group' resource? I tried to determine the failure count using the 'crm_failcount' utility but it always reports a value of '1' even on multiple failures of the 'ip_group' resource ('crm_failcount -V -G -r ip_group'). Is there a means of detecting how many times a resource has failed?
> >
> > Thanks,
> > Mahesh
> >
> > _______________________________________________
> > Linux-HA mailing list
> > [email protected]
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > See also: http://linux-ha.org/ReportingProblems
>
> ------------------------------
>
> Message: 2
> Date: Tue, 28 Jul 2009 13:53:50 +0200
> From: Andrew Beekhof <[email protected]>
> Subject: Re: [Linux-HA] Linux-HA Digest, Vol 68, Issue 56
> To: General Linux-HA mailing list <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1
>
> On Mon, Jul 27, 2009 at 5:47 AM, Ahmed Munir<[email protected]> wrote:
> > Thanks for replying Mr. Andrew Beekhof.
> >
> > With reference to Message: 1, I'm using CentOS 5.3 Linux and the version of heartbeat I'm using is 2.1.3-3.
> >
> > Kindly let me know if there is a bug in this version
>
> 2.1.3 was quite some time ago; there are definitely bugs in it.
>
> > and also do please mention how to fix it.
>
> Update to a recent version of pacemaker.
>
> ------------------------------
>
> Message: 3
> Date: Tue, 28 Jul 2009 13:57:38 +0200
> From: Andrew Beekhof <[email protected]>
> Subject: Re: [Linux-HA] CRM issues
> To: General Linux-HA mailing list <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1
>
> On Wed, Jul 8, 2009 at 5:58 PM, Bret E. Palsson<[email protected]> wrote:
> > All was done in reference to http://clusterlabs.org/mediawiki/images/8/8d/Crm_cli.pdf - pages 3 and 4.
> >
> > When pasting the following I get errors. However, if I enter the crm shell and paste line by line I don't get errors and everything works dandy. Any suggestions on how I can "script" this configuration?
>
> It should work, but Dejan's the one that understands this stuff (and he's on vacation).
> I wonder if it's an escaping thing; perhaps $id is being expanded...
>
> Then again, maybe it's a bug that's been fixed since.
What version are you running?
>
> > I've also tried: crm configure show > backup (obviously after having a valid configuration to back up)
> >
> > and then pasted the contents of the backup after the erase command below. That didn't work either.
> >
> > crm<<EOF
> > configure
> > erase
> > primitive virtual_ip ocf:heartbeat:IPaddr2 operations $id="virtual_ip-operations" op monitor interval="10s" timeout="20s" start-delay="5s" params ip="10.130.0.5" nic="eth0" cidr_netmask="16" meta $id="virtual_ip-meta_attributes"
> > primitive pgpool-ha ocf:pacemaker:pgpoolha operations $id="pgpool-ha-operations" op monitor interval="10" timeout="20" start-delay="0" params pgpool="/usr/bin/pgpool" pgpoolconf="/etc/pgpool.conf" pcpconf="/etc/pcp.conf" pool_hbaconf="/etc/pool_hba.conf" forcestop="10" meta $id="pgpool-ha-meta_attributes"
> > primitive citrix-stonith stonith:external/citrix-xenserver operations $id="citrix-stonith-operations" op monitor interval="15" timeout="15" start-delay="15" params hostlist="dbcontroller1.net:dbcontroller1.master,dbcontroller2.net:dbcontroller2.master" poolMaster="10.128.250.1" poolMasterUserName="root" poolMasterPassword="nada" meta $id="citrix-stonith-meta_attributes"
> > group pgpool-ha-group virtual_ip pgpool-ha meta target-role="started"
> > clone stone-citrix citrix-stonith meta target-role="started"
> > property $id="cib-bootstrap-options" expected-quorum-votes="2" no-quorum-policy="ignore" stonith-timeout="30s" default-action-timeout="30s" cluster-delay="30s"
> > rsc_defaults $id="rsc_defaults-options" resource-stickiness="INFINITY"
> > commit
> > EOF
> >
> > OUTPUT:
> > element nvpair: Relax-NG validity error : Expecting element op, got nvpair
> > Relax-NG validity error : Extra element operations in interleave
> > element operations: Relax-NG validity error : Element primitive failed to validate content
> > element group: Relax-NG validity error : Invalid sequence in interleave
> > element
group: Relax-NG validity error : Element group failed to validate content
> > element cib: Relax-NG validity error : Element cib failed to validate content
> > crm_verify[3180]: 2009/07/08_02:02:08 ERROR: main: CIB did not pass DTD/schema validation
> > Errors found during check: config not valid
> > WARNING: 10: crm_verify(8) found errors in the CIB
> > INFO: 10: use commit force if you know what you are doing
> > Call cib_modify failed (-47): Update does not conform to the configured schema/DTD
> > <null>
>
> ------------------------------
>
> Message: 4
> Date: Tue, 28 Jul 2009 18:29:29 +0530
> From: abhishek agrawal <[email protected]>
> Subject: [Linux-HA] problems in adding time based rule to IPaddr resource
> To: [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1
>
> I was trying to add a simple rule to achieve a time-dependent target-role. Before the addition of the rule, the following CIB was working.
> <cib validate-with="pacemaker-1.0" crm_feature_set="3.0.1" have-quorum="0" admin_epoch="0" epoch="34" num_updates="5" cib-last-written="Sat Jul 25 22:38:00 2009" dc-uuid="7d28a8e7-3948-42af-8308-5275972f2e2a">
>   <configuration>
>     <crm_config>
>       <cluster_property_set id="cib-bootstrap-options">
>         <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.0.4-6dede86d6105786af3a5321ccf66b44b6914f0aa"/>
>         <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="Heartbeat"/>
>         <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="false"/>
>         <nvpair id="cib-bootstrap-options-no-quorum-policy" name="no-quorum-policy" value="ignore"/>
>       </cluster_property_set>
>     </crm_config>
>     <nodes>
>       <node id="7d28a8e7-3948-42af-8308-5275972f2e2a" uname="kf-cent-dm2" type="normal"/>
>       <node id="9d0d3088-b98a-4bc0-a8da-c500176a799c" uname="kf-cent-dm1" type="normal"/>
>     </nodes>
>     <resources>
>       <primitive class="ocf" id="failover-ip" provider="heartbeat" type="IPaddr">
>         <instance_attributes id="failover-ip-instance_attributes">
>           <nvpair id="failover-ip-instance_attributes-ip" name="ip" value="15.154.59.49"/>
>         </instance_attributes>
>         <operations>
>           <op id="failover-ip-monitor-5s" interval="5s" name="monitor"/>
>         </operations>
>         <meta_attributes id="core-hours" score="10">
>           <nvpair id="core-hour-role" name="target-role" value="started"/>
>         </meta_attributes>
>         <meta_attributes id="after-hours" score="5">
>           <nvpair id="after-hour-role" name="target-role" value="stopped"/>
>         </meta_attributes>
>       </primitive>
>     </resources>
>     <constraints/>
>     <rsc_defaults/>
>     <op_defaults/>
>   </configuration>
>
> I was trying to add the following rule:
>
> <rule id="core-hour-rule">
>   <date_expression id="9to5" operation="date_spec">
>     <date_spec hours="9-17"/>
>   </date_expression>
> </rule>
>
> so my modified cib.xml looks like the following:
>
> <cib validate-with="pacemaker-1.0" crm_feature_set="3.0.1" have-quorum="0" admin_epoch="0" epoch="34" num_updates="5" cib-last-written="Sat Jul 25 22:38:00 2009" dc-uuid="7d28a8e7-3948-42af-8308-5275972f2e2a">
>   <configuration>
>     <crm_config>
>       <cluster_property_set id="cib-bootstrap-options">
>         <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.0.4-6dede86d6105786af3a5321ccf66b44b6914f0aa"/>
>         <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="Heartbeat"/>
>         <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="false"/>
>         <nvpair id="cib-bootstrap-options-no-quorum-policy" name="no-quorum-policy" value="ignore"/>
>       </cluster_property_set>
>     </crm_config>
>     <nodes>
>       <node id="7d28a8e7-3948-42af-8308-5275972f2e2a" uname="kf-cent-dm2" type="normal"/>
>       <node id="9d0d3088-b98a-4bc0-a8da-c500176a799c" uname="kf-cent-dm1" type="normal"/>
>     </nodes>
>     <resources>
>       <primitive class="ocf" id="failover-ip" provider="heartbeat" type="IPaddr">
>         <instance_attributes id="failover-ip-instance_attributes">
>           <nvpair id="failover-ip-instance_attributes-ip" name="ip" value="15.154.59.49"/>
>         </instance_attributes>
>         <operations>
>           <op id="failover-ip-monitor-5s" interval="5s" name="monitor"/>
>         </operations>
>         <meta_attributes id="core-hours" score="10">
>           <rule id="core-hour-rule">
>             <date_expression id="9to5" operation="date_spec">
>               <date_spec hours="9-17"/>
>             </date_expression>
>           </rule>
>           <nvpair id="core-hour-role" name="target-role" value="started"/>
>         </meta_attributes>
>         <meta_attributes id="after-hours" score="5">
>           <nvpair id="after-hour-role" name="target-role" value="stopped"/>
>         </meta_attributes>
>       </primitive>
>     </resources>
>     <constraints/>
>     <rsc_defaults/>
>     <op_defaults/>
>   </configuration>
>
> But when I try to replace this file it says:
>
> Update does not conform to the configured schema/DTD
> <null>
>
> Can anyone tell me where the mistake is?
>
> --abhishek
>
> ------------------------------
>
> Message: 5
> Date: Tue, 28 Jul 2009 16:37:22 +0200
> From: [email protected]
> Subject: Re: [Linux-HA] stand_alone_ping: Node xx.yy.zz.ww is unreachable (read)
> To: [email protected], General Linux-HA mailing list <[email protected]>
> Message-ID: <q273266430-31c6ba17533bb4e341c9e10f04944...@pmq4.mod5.onet.test.onet.pl>
> Content-Type: text/plain; charset=iso-8859-2
>
> Does anybody have a clue what is going on - is this a bug, or a real problem with the connection that is not noticed by the system ping?
>
> Is there any way to replace pingd with e.g. a bash script that checks connectivity and reports a connection status to the heartbeat system (e.g. resource is stopped or resource has failed)? So that the score for a certain resource group is recalculated and, in the case of connectivity problems, the resource group is relocated to the other machine. Is this practically possible to apply to the crm-style configuration?
>
> E.g. bash subroutine:
>
> check_connection () {
>   node=$1
>   [ -z "$node" ] && return 1
>   NPACKETS=3
>   stat=0
>   ping -n -q -c $NPACKETS "$node" >/dev/null 2>&1
>   if [ "$?" -ne 0 ]; then
>     echo "ERROR: Ping node $node does not answer to ICMP pings"
>     stat=1
>   else
>     echo "INFO: Ping node $node answers to ICMP pings"
>   fi
>   return $stat
> }
>
> I would be grateful for help,
>
> Jarek
>
> "General Linux-HA mailing list" <[email protected]> wrote:
> >
> > I found additionally the error message attached below. Please advise.
> >
> > Thanks
> > Jarek
> >
> > pingd[6890]: 2009/07/24_14:47:15 debug: stand_alone_ping: Node 3.27.60.1 is alive
> > pingd[6890]: 2009/07/24_14:47:15 debug: debug2: ping_close: Closed connection to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:47:15 debug: send_update: Sent update: pingd=1000 (1 active ping nodes)
> > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: stand_alone_ping: Checking connectivity
> > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: ping_open: Got address 3.27.60.1 for 3.27.60.1
> > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: ping_open: Opened connection to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: ping_read: Got 59 bytes
> > No error message: -1: Resource temporarily unavailable (11)
> > pingd[6890]: 2009/07/24_14:47:16 debug: process_icmp_error: No error message: -1: Resource temporarily unavailable (11)
> > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: dump_v4_echo: Echo from 3.27.60.1 (exp=1238, seq=18367, id=11669, dest=3.27.60.1, data=pingd-v4): Echo Reply
> > pingd[6890]: 2009/07/24_14:47:16 info: stand_alone_ping: Node 3.27.60.1 is unreachable (read)
> > pingd[6890]: 2009/07/24_14:47:17 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:47:17 debug: debug2: ping_read: Got 59 bytes
> > No error message: -1: Resource temporarily unavailable (11)
> > pingd[6890]: 2009/07/24_14:47:17 debug: process_icmp_error: No error message: -1: Resource temporarily unavailable (11)
> > pingd[6890]: 2009/07/24_14:47:17 debug: debug2: dump_v4_echo: Echo from 3.27.60.1 (exp=1239, seq=1238, id=6890, dest=3.27.60.1, data=pingd-v4): Echo Reply
> > pingd[6890]: 2009/07/24_14:47:17 info: stand_alone_ping: Node 3.27.60.1 is unreachable (read)
> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_close: Closed connection to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:47:18 debug: send_update: Sent update: pingd=0 (0 active ping nodes)
> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: stand_alone_ping: Checking connectivity
> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_open: Got address 3.27.60.1 for 3.27.60.1
> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_open: Opened connection to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_read: Got 59 bytes
> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: dump_v4_echo: Echo from 3.27.60.1 (exp=1240, seq=1240, id=6890, dest=3.27.60.1, data=pingd-v4): Echo Reply
> > pingd[6890]: 2009/07/24_14:47:18 debug: stand_alone_ping: Node 3.27.60.1 is alive
> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_close: Closed connection to 3.27.60.1
> >
> > "General Linux-HA mailing list" <[email protected]> wrote:
> > > Below is part of the output with the error message produced by the command:
> > > /usr/lib64/heartbeat/pingd -VVV -a pingd -d 10 -m 1000 -h 3.27.60.1
> > >
> > > The machine has three network interfaces and is connected to three different subnets (3.27.x.x, 192.168.x.x - cluster subnet, 172.22.x.x - dedicated for heartbeat).
> > >
> > > pingd[6890]: 2009/07/24_14:44:36 debug: debug2: ping_close: Closed connection to 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:36 debug: send_update: Sent update: pingd=1000 (1 active ping nodes)
> > > pingd[6890]: 2009/07/24_14:44:37 debug: debug2: stand_alone_ping: Checking connectivity
> > > pingd[6890]: 2009/07/24_14:44:37 debug: debug2: ping_open: Got address 3.27.60.1 for 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:37 debug: debug2: ping_open: Opened connection to 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:37 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_read: Got 59 bytes
> > > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: dump_v4_echo: Echo from 3.27.60.1 (exp=1080, seq=1080, id=6890, dest=3.27.60.1, data=pingd-v4): Echo Reply
> > > pingd[6890]: 2009/07/24_14:44:38 debug: stand_alone_ping: Node 3.27.60.1 is alive
> > > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_close: Closed connection to 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:38 debug: send_update: Sent update: pingd=1000 (1 active ping nodes)
> > > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: stand_alone_ping: Checking connectivity
> > > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_open: Got address 3.27.60.1 for 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_open: Opened connection to 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:39 debug: debug2: ping_read: Got 262 bytes
> > > No error message: -1: Resource temporarily unavailable (11)
> > > pingd[6890]: 2009/07/24_14:44:39 debug: process_icmp_error: No error message: -1: Resource temporarily unavailable (11)
> > > pingd[6890]: 2009/07/24_14:44:39 debug: debug2: dump_v4_echo: Echo from 172.22.10.2 (exp=1081, seq=0, id=0, dest=3.27.60.1, data=E?): Unreachable Port
> > > pingd[6890]: 2009/07/24_14:44:39 info: stand_alone_ping: Node 3.27.60.1 is unreachable (read)
> > > pingd[6890]: 2009/07/24_14:44:40 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:40 debug: debug2: ping_read: Got 262 bytes
> > > No error message: -1: Resource temporarily unavailable (11)
> > > pingd[6890]: 2009/07/24_14:44:40 debug: process_icmp_error: No error message: -1: Resource temporarily unavailable (11)
> > > pingd[6890]: 2009/07/24_14:44:40 debug: debug2: dump_v4_echo: Echo from 192.168.0.5 (exp=1082, seq=0, id=0, dest=3.27.60.1, data=E?): Unreachable Port
> > > pingd[6890]: 2009/07/24_14:44:40 info: stand_alone_ping: Node 3.27.60.1 is unreachable (read)
> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_close: Closed connection to 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:41 debug: send_update: Sent update: pingd=0 (0 active ping nodes)
> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: stand_alone_ping: Checking connectivity
> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_open: Got address 3.27.60.1 for 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_open: Opened connection to 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_read: Got 59 bytes
> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: dump_v4_echo: Echo from 3.27.60.1 (exp=1083, seq=1083, id=6890, dest=3.27.60.1, data=pingd-v4): Echo Reply
> > > pingd[6890]: 2009/07/24_14:44:41 debug: stand_alone_ping: Node 3.27.60.1 is alive
> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_close: Closed connection to 3.27.60.1
> > > pingd[6890]: 2009/07/24_14:44:41 debug: send_update: Sent update: pingd=1000 (1 active ping nodes)
> > >
> > > Thanks
> > > Jarek
> > >
> > > "General Linux-HA mailing list" <[email protected]> wrote:
> > > > 2009/7/24 <[email protected]>:
> > > > >
> > > > > Rpm built for RHEL5:
> > > > > heartbeat-common-2.99.2-8.1
> > > > > libheartbeat2-2.99.2-8.1
> > > > > heartbeat-2.99.2-8.1
> > > > > heartbeat-resources-2.99.2-8.1
> > > > > pacemaker-1.0.3-2.2
> > > > > pacemaker-mgmt-client-1.99.1-2.1
> > > > > libpacemaker3-1.0.3-2.2
> > > > > pacemaker-mgmt-1.99.1-2.1
> > > > >
> > > > > If I start pingd manually (alongside working heartbeat+pacemaker), it gives me the following when "stand_alone_ping: Node xx.yy.zz.ww is unreachable (read)" appears in /var/log/ha-debug:
> > > > >
> > > > > [r...@gate2]# date; /usr/lib64/heartbeat/pingd -a pingd -d 10 -m 1000 -h xx.yy.zz.ww; date
> > > > > Thu Jul 23 19:25:24 CEST 2009
> > > > > No error message: -1: Resource temporarily unavailable (11)
> > > > > No error message: -1: Resource temporarily unavailable (11)
> > > > > No error message: -1: Resource temporarily unavailable (11)
> > > > > No error message: -1: Resource temporarily unavailable (11)
> > > > > No error message: -1: Resource temporarily unavailable (11)
> > > > > No error message: -1: Resource temporarily unavailable (11)
> > > > > ...
> > > > >
> > > > > System ping reports no errors.
> > > >
> > > > If you repeat that test with some extra -V arguments, you should see more information (which would be helpful).
> > > > But it's pretty clear there must be a bug, so it's probably worth creating an entry in bugzilla.
>
> ------------------------------
>
> Message: 6
> Date: Tue, 28 Jul 2009 16:55:21 +0200
> From: Miguel Olivares <[email protected]>
> Subject: [Linux-HA] HA cluster 2 IP Service and bonding
> To: [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Hello,
>
> I have 2 servers in my cluster with 4 Ethernet cards each; I used bonding in order to have full redundancy. I didn't have any problem with my configuration; it works, but not as well as I expected. What I want is that when one of the two bonding interfaces ("bond0" or "bond1") on the primary server stops responding, that server gives up and the other server can take over the service, but I don't know how to do that. I've included my config files ha.cf and haresources.
>
> eth0 and eth1 -> bond0  # on Server1 and Server2
> eth2 and eth3 -> bond1  # on Server1 and Server2
>
> 192.168.1.10  Server1      # bond0
> 192.168.1.11  Server2      # bond0
> 192.168.1.12  virtual IP1
>
> 172.16.10.10  Server1      # bond1
> 172.16.10.11  Server2      # bond1
> 172.16.10.12  virtual IP2
>
> In my configuration I see both virtual IPs on Server1, but when I put down bond1, Server1 still continues as the primary node.
> Can anybody help me?
>
> Thanks,
>
> regards.
>
> [ha.cf]
> debugfile /var/log/ha-debug
> logfile /var/log/ha-log
> keepalive 2
> deadtime 30
> warntime 10
> initdead 60
> udpport 694
> bcast bond0 bond1
> auto_failback off
> node Server1 Server2
> ping 192.168.1.254
> ping 172.16.10.254
>
> [haresources]
> Server1 192.168.1.12 172.16.1.12
>
> ------------------------------
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>
> End of Linux-HA Digest, Vol 68, Issue 63
> ****************************************

--
Regards,
Ahmed Munir
<cib epoch="7" num_updates="3" admin_epoch="0" validate-with="pacemaker-1.0" crm_feature_set="3.0.1" have-quorum="1" cib-last-written="Wed Jul 29 15:27:45 2009" dc-uuid="70503c2e-bb4a-48f8-aab3-53696656a4d0">
<configuration>
<crm_config>
<cluster_property_set id="cib-bootstrap-options">
<attributes>
<nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.0.4-6dede86d6105786af3a5321ccf66b44b6914f0aa"/>
<nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="Heartbeat"/>
</attributes>
</cluster_property_set>
</crm_config>
<nodes>
<node id="70503c2e-bb4a-48f8-aab3-53696656a4d0" uname="ha2" type="normal"/>
<node id="e651c120-b9a1-489a-baf7-caf0028ad540" uname="ha1" type="normal"/>
</nodes>
<resources>
<primitive class="ocf" provider="heartbeat" type="IPaddr" id="IPaddr_1">
<operations>
<op id="IPaddr_1_mon" interval="10s" name="monitor" timeout="8s"/>
</operations>
<instance_attributes id="IPaddr_1_inst_attr">
<attributes>
<nvpair name="ip" value="192.168.0.184" id="IPaddr_1_machine_1"/>
</attributes>
</instance_attributes>
</primitive>
<primitive class="ocf" provider="heartbeat" type="IPaddr" id="IPaddr_2">
<operations>
<op id="IPaddr_2_mon" interval="10s" name="monitor" timeout="8s"/>
</operations>
<instance_attributes id="IPaddr_2_inst_attr">
<attributes>
<nvpair name="ip" value="192.168.0.185" id="IPaddr_2_machine_2"/>
</attributes>
</instance_attributes>
</primitive>
</resources>
<constraints>
<rsc_location id="rsc_location_IPaddr_1" rsc="IPaddr_1">
<rule id="prefered_location_IPaddr_1" score="200">
<expression attribute="#uname" id="prefered_location_IPaddr_1_expr" operation="eq" value="ha1"/>
</rule>
</rsc_location>
<rsc_location id="rsc_location_IPaddr_2" rsc="IPaddr_2">
<rule id="prefered_location_IPaddr_2" score="200">
<expression attribute="#uname" id="prefered_location_IPaddr_2_expr" operation="eq" value="ha2"/>
</rule>
</rsc_location>
<rsc_location id="my1_resource1:connected" rsc="IPaddr_1">
<rule id="my1_resource1:connected:rule" score_attribute="pingd">
<expression id="my1_resource1:connected:expr:defined" attribute="pingd" operation="defined"/>
</rule>
</rsc_location>
<rsc_location id="my2_resource2:connected" rsc="IPaddr_2">
<rule id="my2_resource2:connected:rule" score_attribute="pingd">
<expression id="my2_resource2:connected:expr:defined" attribute="pingd" operation="defined"/>
</rule>
</rsc_location>
</constraints>
</configuration>
<status>
<node_state id="70503c2e-bb4a-48f8-aab3-53696656a4d0" uname="ha2" ha="active" in_ccm="true" crmd="online" join="member" expected="member" crm-debug-origin="do_state_transition" shutdown="0">
<lrm id="70503c2e-bb4a-48f8-aab3-53696656a4d0">
<lrm_resources/>
</lrm>
<transient_attributes id="70503c2e-bb4a-48f8-aab3-53696656a4d0">
<instance_attributes id="status-70503c2e-bb4a-48f8-aab3-53696656a4d0">
<nvpair id="status-70503c2e-bb4a-48f8-aab3-53696656a4d0-probe_complete" name="probe_complete" value="true"/>
<nvpair id="status-70503c2e-bb4a-48f8-aab3-53696656a4d0-pingd" name="pingd" value="100"/>
</instance_attributes>
</transient_attributes>
</node_state>
</status>
</cib>
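A note on the validation errors above: the pacemaker-1.0 schema no longer accepts the heartbeat-2-era <attributes> wrapper elements present in this file; nvpair elements are expected directly inside instance_attributes (and cluster_property_set). The supported route is pacemaker's own CIB upgrade tooling, but purely as a hypothetical illustration of the transformation (this helper is not part of any shipped tool), the wrappers could be stripped like this:

```python
# Sketch: promote the children of old heartbeat-2-style <attributes>
# wrappers so <nvpair> elements sit directly under their parent element,
# as the pacemaker-1.0 Relax-NG schema expects. Hypothetical helper for
# illustration only; pacemaker's upgrade tooling is the real path.
import xml.etree.ElementTree as ET

def strip_attributes_wrappers(xml_text):
    root = ET.fromstring(xml_text)
    # Materialize the element list first, since we mutate the tree.
    for parent in list(root.iter()):
        for wrapper in list(parent.findall("attributes")):
            idx = list(parent).index(wrapper)
            parent.remove(wrapper)
            # Re-insert the wrapper's children at the wrapper's position.
            for child in reversed(list(wrapper)):
                parent.insert(idx, child)
    return ET.tostring(root, encoding="unicode")

old_style = (
    '<instance_attributes id="IPaddr_1_inst_attr">'
    '<attributes>'
    '<nvpair name="ip" value="192.168.0.184" id="IPaddr_1_machine_1"/>'
    '</attributes>'
    '</instance_attributes>'
)
print(strip_attributes_wrappers(old_style))
```

Applied to the attached a.xml, this would leave the nvpairs in place without the wrapper elements the schema rejects.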
ha.cf
Description: Binary data
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
