Please read the "What Changed" section of Configuration Explained 1.0. The section on upgrading the syntax might be of some help too.
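For reference, the commands I'd expect for that check-and-upgrade sequence are roughly the following (a sketch from memory; check your cibadmin and crm_verify man pages, as option behaviour can differ between releases):

```
# Show in detail why the old-style configuration fails 1.0 validation
crm_verify -V -x a.xml

# Ask pacemaker to convert the configuration to the current syntax
# (bumps the validate-with schema; --force may be needed)
cibadmin --upgrade --force
```

After the upgrade, re-run crm_verify to confirm the CIB now validates.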
Btw, replying to digests is highly annoying.

On Wed, Jul 29, 2009 at 1:01 PM, Ahmed Munir <[email protected]> wrote:
> Hi,
>
> With reference to the current Message: 2, I've updated my heartbeat
> 2.1.3-3 to 2.99 and also installed the updated version of pacemaker,
> i.e. 1.0.4. I'm facing a problem when I edit/allocate resources using
> cibadmin -Q > a.xml and replace the file using the command
> cibadmin -R -x a.xml; it gives me the error listed below:
>
> Call cib_replace failed (-47): Update does not conform to the configured
> schema/DTD
> <null>
>
> When I verify the file using crm_verify -x a.xml, it shows the following
> errors:
>
> a.xml:19: element attributes: Relax-NG validity error : Element
>   instance_attributes has extra content: attributes
> Relax-NG validity error : Extra element instance_attributes in interleave
> a.xml:18: element instance_attributes: Relax-NG validity error : Element
>   primitive failed to validate content
> a.xml:14: element primitive: Relax-NG validity error : Element resources
>   has extra content: primitive
> a.xml:2: element configuration: Relax-NG validity error : Invalid sequence
>   in interleave
> a.xml:1: element cib: Relax-NG validity error : Element cib failed to
>   validate content
> crm_verify[3524]: 2009/07/29_16:11:54 ERROR: main: CIB did not pass
>   DTD/schema validation
> Errors found during check: config not valid
>
> I was using this resource configuration (a.xml) in heartbeat 2.1.3, and
> at that time I wasn't facing any errors.
>
> I'm attaching my configuration files; kindly reply soon.
>
> Regards,
> Ahmed Munir
>
> On Tue, Jul 28, 2009 at 7:12 PM, <[email protected]> wrote:
>> Send Linux-HA mailing list submissions to
>>         [email protected]
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>>         http://lists.linux-ha.org/mailman/listinfo/linux-ha
>> or, via email, send a message with subject or body 'help' to
>>         [email protected]
>>
>> You can reach the person managing the list at
>>         [email protected]
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of Linux-HA digest..."
>>
>> Today's Topics:
>>
>>   1. Re: get attribute value from commandline (Andrew Beekhof)
>>   2. Re: Linux-HA Digest, Vol 68, Issue 56 (Andrew Beekhof)
>>   3. Re: CRM issues (Andrew Beekhof)
>>   4. problems in adding time based rule to IPaddr resource (abhishek agrawal)
>>   5. Re: stand_alone_ping: Node xx.yy.zz.ww is unreachable (read) ([email protected])
>>   6. HA cluster 2 IP Service and bonding (Miguel Olivares)
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Tue, 28 Jul 2009 13:52:23 +0200
>> From: Andrew Beekhof <[email protected]>
>> Subject: Re: [Linux-HA] get attribute value from commandline
>> To: General Linux-HA mailing list <[email protected]>
>> Message-ID: <[email protected]>
>> Content-Type: text/plain; charset=ISO-8859-1
>>
>> Newer versions allow xpath queries. In your case, you'd run:
>>   cibadmin --query --xpath "//nvpair[@name='pingd']"
>>
>> On Tue, Jul 28, 2009 at 6:53 AM, MAHESH, SIDDACHETTY M (SIDDACHETTY M) <[email protected]> wrote:
>> > Hi,
>> >
>> > I need to find out what the attribute value and calculated score is for a
>> > resource. Is it possible to get this info from the command line using some
>> > utility?
>> >
>> > For example, my cib.xml has this entry:
>> >
>> > <rsc_location id="ipaddress_connected" rsc="ip_group">
>> >   <rule id="ipaddress_connected_rule" score="-INFINITY" boolean_op="or">
>> >     <expression id="ipaddress_connected_rule_expr_undefined"
>> >                 attribute="pingd" operation="not_defined"/>
>> >     <expression id="ipaddress_connected_rule_expr_zero"
>> >                 attribute="pingd" operation="lte" value="0"/>
>> >   </rule>
>> > </rsc_location>
>> >
>> > Is it possible to get the value of the 'pingd' attribute? Also, is it
>> > possible to determine what the calculated score is for the 'ip_group'
>> > resource? I tried to determine the failure count using the 'crm_failcount'
>> > utility, but it always reports a value of '1' even on multiple failures of
>> > the 'ip_group' resource ('crm_failcount -V -G -r ip_group'). Is there a
>> > means of detecting how many times a resource has failed?
>> >
>> > Thanks,
>> > Mahesh
>> >
>> > _______________________________________________
>> > Linux-HA mailing list
>> > [email protected]
>> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
>> > See also: http://linux-ha.org/ReportingProblems
>>
>> ------------------------------
>>
>> Message: 2
>> Date: Tue, 28 Jul 2009 13:53:50 +0200
>> From: Andrew Beekhof <[email protected]>
>> Subject: Re: [Linux-HA] Linux-HA Digest, Vol 68, Issue 56
>> To: General Linux-HA mailing list <[email protected]>
>> Message-ID: <[email protected]>
>> Content-Type: text/plain; charset=ISO-8859-1
>>
>> On Mon, Jul 27, 2009 at 5:47 AM, Ahmed Munir <[email protected]> wrote:
>> > Thanks for replying, Mr. Andrew Beekhof.
>> >
>> > With reference to Message: 1, I'm using CentOS 5.3 Linux, and the heartbeat
>> > version I'm using is 2.1.3-3.
>> >
>> > Kindly let me know if there is a bug in this version
>>
>> 2.1.3 was quite some time ago; there are definitely bugs in it.
>>
>> > and also do please
>> > mention how to fix it.
>>
>> Update to a recent version of pacemaker.
>>
>> ------------------------------
>>
>> Message: 3
>> Date: Tue, 28 Jul 2009 13:57:38 +0200
>> From: Andrew Beekhof <[email protected]>
>> Subject: Re: [Linux-HA] CRM issues
>> To: General Linux-HA mailing list <[email protected]>
>> Message-ID: <[email protected]>
>> Content-Type: text/plain; charset=ISO-8859-1
>>
>> On Wed, Jul 8, 2009 at 5:58 PM, Bret E. Palsson <[email protected]> wrote:
>> > All was done in reference to
>> > http://clusterlabs.org/mediawiki/images/8/8d/Crm_cli.pdf, pages 3 and 4.
>> >
>> > When pasting the following I get errors. However, if I enter the crm shell
>> > and paste line by line, I don't get errors and everything works dandy. Any
>> > suggestions on how I can "script" this configuration?
>>
>> It should work, but Dejan's the one that understands this stuff (and
>> he's on vacation).
>> I wonder if it's an escaping thing; perhaps $id is being expanded...
>>
>> Then again, maybe it's a bug that's been fixed since. What version are
>> you running?
>>
>> > I've also tried: crm configure show > backup  (obviously after having a
>> > valid configuration to back up)
>> >
>> > and then pasted the contents of the backup after the erase command below.
>> > That didn't work either.
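The expansion theory is easy to test without a cluster: in an unquoted here-document the shell substitutes $id before crm ever sees the text, while quoting the delimiter passes it through literally. A minimal, self-contained sketch (no crm needed):

```shell
#!/bin/sh
# With an unquoted delimiter the shell expands $id (unset here, so the
# token disappears entirely) before the text reaches the consumer.
unset id
unquoted=$(cat <<EOF
meta $id="virtual_ip-meta_attributes"
EOF
)

# Quoting the delimiter ('EOF') disables all expansion, so the consumer
# would receive the $id token intact.
quoted=$(cat <<'EOF'
meta $id="virtual_ip-meta_attributes"
EOF
)

echo "unquoted: $unquoted"   # unquoted: meta ="virtual_ip-meta_attributes"
echo "quoted:   $quoted"     # quoted:   meta $id="virtual_ip-meta_attributes"
```

If that is what is happening here, writing `crm <<'EOF'` (quoted delimiter) should make the pasted configuration behave the same as typing it interactively.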
>> >
>> > crm <<EOF
>> > configure
>> > erase
>> > primitive virtual_ip ocf:heartbeat:IPaddr2 operations $id="virtual_ip-operations" op monitor interval="10s" timeout="20s" start-delay="5s" params ip="10.130.0.5" nic="eth0" cidr_netmask="16" meta $id="virtual_ip-meta_attributes"
>> > primitive pgpool-ha ocf:pacemaker:pgpoolha operations $id="pgpool-ha-operations" op monitor interval="10" timeout="20" start-delay="0" params pgpool="/usr/bin/pgpool" pgpoolconf="/etc/pgpool.conf" pcpconf="/etc/pcp.conf" pool_hbaconf="/etc/pool_hba.conf" forcestop="10" meta $id="pgpool-ha-meta_attributes"
>> > primitive citrix-stonith stonith:external/citrix-xenserver operations $id="citrix-stonith-operations" op monitor interval="15" timeout="15" start-delay="15" params hostlist="dbcontroller1.net:dbcontroller1.master,dbcontroller2.net:dbcontroller2.master" poolMaster="10.128.250.1" poolMasterUserName="root" poolMasterPassword="nada" meta $id="citrix-stonith-meta_attributes"
>> > group pgpool-ha-group virtual_ip pgpool-ha meta target-role="started"
>> > clone stone-citrix citrix-stonith meta target-role="started"
>> > property $id="cib-bootstrap-options" expected-quorum-votes="2" no-quorum-policy="ignore" stonith-timeout="30s" default-action-timeout="30s" cluster-delay="30s"
>> > rsc_defaults $id="rsc_defaults-options" resource-stickiness="INFINITY"
>> > commit
>> > EOF
>> >
>> > OUTPUT:
>> > element nvpair: Relax-NG validity error : Expecting element op, got nvpair
>> > Relax-NG validity error : Extra element operations in interleave
>> > element operations: Relax-NG validity error : Element primitive failed to validate content
>> > element group: Relax-NG validity error : Invalid sequence in interleave
>> > element group: Relax-NG validity error : Element group failed to validate content
>> > element cib: Relax-NG validity error : Element cib failed to validate content
>> > crm_verify[3180]: 2009/07/08_02:02:08 ERROR: main: CIB did not pass DTD/schema validation
>> > Errors found during check: config not valid
>> > WARNING: 10: crm_verify(8) found errors in the CIB
>> > INFO: 10: use commit force if you know what you are doing
>> > Call cib_modify failed (-47): Update does not conform to the configured schema/DTD
>> > <null>
>>
>> ------------------------------
>>
>> Message: 4
>> Date: Tue, 28 Jul 2009 18:29:29 +0530
>> From: abhishek agrawal <[email protected]>
>> Subject: [Linux-HA] problems in adding time based rule to IPaddr resource
>> To: [email protected]
>> Message-ID: <[email protected]>
>> Content-Type: text/plain; charset=ISO-8859-1
>>
>> I was trying to add a simple rule to achieve a time-dependent target-role.
>> Before the addition of the rule, the following CIB was working.
>>
>> <cib validate-with="pacemaker-1.0" crm_feature_set="3.0.1" have-quorum="0"
>>      admin_epoch="0" epoch="34" num_updates="5"
>>      cib-last-written="Sat Jul 25 22:38:00 2009"
>>      dc-uuid="7d28a8e7-3948-42af-8308-5275972f2e2a">
>>   <configuration>
>>     <crm_config>
>>       <cluster_property_set id="cib-bootstrap-options">
>>         <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.0.4-6dede86d6105786af3a5321ccf66b44b6914f0aa"/>
>>         <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="Heartbeat"/>
>>         <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="false"/>
>>         <nvpair id="cib-bootstrap-options-no-quorum-policy" name="no-quorum-policy" value="ignore"/>
>>       </cluster_property_set>
>>     </crm_config>
>>     <nodes>
>>       <node id="7d28a8e7-3948-42af-8308-5275972f2e2a" uname="kf-cent-dm2" type="normal"/>
>>       <node id="9d0d3088-b98a-4bc0-a8da-c500176a799c" uname="kf-cent-dm1" type="normal"/>
>>     </nodes>
>>     <resources>
>>       <primitive class="ocf" id="failover-ip" provider="heartbeat" type="IPaddr">
>>         <instance_attributes id="failover-ip-instance_attributes">
>>           <nvpair id="failover-ip-instance_attributes-ip" name="ip" value="15.154.59.49"/>
>>         </instance_attributes>
>>         <operations>
>>           <op id="failover-ip-monitor-5s" interval="5s" name="monitor"/>
>>         </operations>
>>         <meta_attributes id="core-hours" score="10">
>>           <nvpair id="core-hour-role" name="target-role" value="started"/>
>>         </meta_attributes>
>>         <meta_attributes id="after-hours" score="5">
>>           <nvpair id="after-hour-role" name="target-role" value="stopped"/>
>>         </meta_attributes>
>>       </primitive>
>>     </resources>
>>     <constraints/>
>>     <rsc_defaults/>
>>     <op_defaults/>
>>   </configuration>
>>
>> I was trying to add the following rule:
>>
>> <rule id="core-hour-rule">
>>   <date_expression id="9to5" operation="date_spec">
>>     <date_spec hours="9-17"/>
>>   </date_expression>
>> </rule>
>>
>> so my modified cib.xml looks like the following:
>>
>> <cib validate-with="pacemaker-1.0" crm_feature_set="3.0.1" have-quorum="0"
>>      admin_epoch="0" epoch="34" num_updates="5"
>>      cib-last-written="Sat Jul 25 22:38:00 2009"
>>      dc-uuid="7d28a8e7-3948-42af-8308-5275972f2e2a">
>>   <configuration>
>>     <crm_config>
>>       <cluster_property_set id="cib-bootstrap-options">
>>         <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.0.4-6dede86d6105786af3a5321ccf66b44b6914f0aa"/>
>>         <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="Heartbeat"/>
>>         <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="false"/>
>>         <nvpair id="cib-bootstrap-options-no-quorum-policy" name="no-quorum-policy" value="ignore"/>
>>       </cluster_property_set>
>>     </crm_config>
>>     <nodes>
>>       <node id="7d28a8e7-3948-42af-8308-5275972f2e2a" uname="kf-cent-dm2" type="normal"/>
>>       <node id="9d0d3088-b98a-4bc0-a8da-c500176a799c" uname="kf-cent-dm1" type="normal"/>
>>     </nodes>
>>     <resources>
>>       <primitive class="ocf" id="failover-ip" provider="heartbeat" type="IPaddr">
>>         <instance_attributes id="failover-ip-instance_attributes">
>>           <nvpair id="failover-ip-instance_attributes-ip" name="ip" value="15.154.59.49"/>
>>         </instance_attributes>
>>         <operations>
>>           <op id="failover-ip-monitor-5s" interval="5s" name="monitor"/>
>>         </operations>
>>         <meta_attributes id="core-hours" score="10">
>>           <rule id="core-hour-rule">
>>             <date_expression id="9to5" operation="date_spec">
>>               <date_spec hours="9-17"/>
>>             </date_expression>
>>           </rule>
>>           <nvpair id="core-hour-role" name="target-role" value="started"/>
>>         </meta_attributes>
>>         <meta_attributes id="after-hours" score="5">
>>           <nvpair id="after-hour-role" name="target-role" value="stopped"/>
>>         </meta_attributes>
>>       </primitive>
>>     </resources>
>>     <constraints/>
>>     <rsc_defaults/>
>>     <op_defaults/>
>>   </configuration>
>>
>> But when I try to replace this file, it says:
>>
>> Update does not conform to the configured schema/DTD
>>
>> <null>
>>
>> Can anyone tell me where the mistake is?
>>
>> --abhishek
>>
>> ------------------------------
>>
>> Message: 5
>> Date: Tue, 28 Jul 2009 16:37:22 +0200
>> From: [email protected]
>> Subject: Re: [Linux-HA] stand_alone_ping: Node xx.yy.zz.ww is unreachable (read)
>> To: [email protected], General Linux-HA mailing list <[email protected]>
>> Message-ID: <q273266430-31c6ba17533bb4e341c9e10f04944...@pmq4.mod5.onet.test.onet.pl>
>> Content-Type: text/plain; charset=iso-8859-2
>>
>> Does anybody have a clue what is going on? Is this a bug, or a real problem
>> with the connection that is not noticed by the system ping?
>>
>> Is there any way to replace pingd with e.g. a bash script that checks the
>> connectivity and reports the connection status to the heartbeat system
>> (e.g. resource is stopped, or resource has a failure)? Then the score for
>> a certain resource group would be recalculated, and in the case of
>> connectivity problems the resource group would be relocated to the other
>> machine. Is this practically possible with the crm-style configuration?
>>
>> E.g. a bash subroutine:
>>
>> check_connection () {
>>   node=$1
>>   [ -z "$node" ] && return 1
>>   NPACKETS=3
>>   stat=0
>>   ping -n -q -c $NPACKETS "$node" >/dev/null 2>&1
>>   if [ "$?" -ne 0 ]; then
>>     echo "ERROR: Ping node $node does not answer to ICMP pings"
>>     stat=1
>>   else
>>     echo "INFO: Ping node $node answers to ICMP pings"
>>   fi
>>   return $stat
>> }
>>
>> I would be grateful for help,
>>
>> Jarek
>>
>> "General Linux-HA mailing list" <[email protected]> wrote:
>> >
>> > I found additionally the error message attached below. Please advise.
>> >
>> > Thanks
>> > Jarek
>> >
>> > pingd[6890]: 2009/07/24_14:47:15 debug: stand_alone_ping: Node 3.27.60.1 is alive
>> > pingd[6890]: 2009/07/24_14:47:15 debug: debug2: ping_close: Closed connection to 3.27.60.1
>> > pingd[6890]: 2009/07/24_14:47:15 debug: send_update: Sent update: pingd=1000 (1 active ping nodes)
>> > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: stand_alone_ping: Checking connectivity
>> > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: ping_open: Got address 3.27.60.1 for 3.27.60.1
>> > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: ping_open: Opened connection to 3.27.60.1
>> > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
>> > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: ping_read: Got 59 bytes
>> > No error message: -1: Resource temporarily unavailable (11)
>> > pingd[6890]: 2009/07/24_14:47:16 debug: process_icmp_error: No error message: -1: Resource temporarily unavailable (11)
>> > pingd[6890]: 2009/07/24_14:47:16 debug: debug2: dump_v4_echo: Echo from 3.27.60.1 (exp=1238, seq=18367, id=11669, dest=3.27.60.1, data=pingd-v4): Echo Reply
>> > pingd[6890]: 2009/07/24_14:47:16 info: stand_alone_ping: Node 3.27.60.1 is unreachable (read)
>> > pingd[6890]: 2009/07/24_14:47:17 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
>> > pingd[6890]: 2009/07/24_14:47:17 debug: debug2: ping_read: Got 59 bytes
>> > No error message: -1: Resource temporarily unavailable (11)
>> > pingd[6890]: 2009/07/24_14:47:17 debug: process_icmp_error: No error message: -1: Resource temporarily unavailable (11)
>> > pingd[6890]: 2009/07/24_14:47:17 debug: debug2: dump_v4_echo: Echo from 3.27.60.1 (exp=1239, seq=1238, id=6890, dest=3.27.60.1, data=pingd-v4): Echo Reply
>> > pingd[6890]: 2009/07/24_14:47:17 info: stand_alone_ping: Node 3.27.60.1 is unreachable (read)
>> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_close: Closed connection to 3.27.60.1
>> > pingd[6890]: 2009/07/24_14:47:18 debug: send_update: Sent update: pingd=0 (0 active ping nodes)
>> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: stand_alone_ping: Checking connectivity
>> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_open: Got address 3.27.60.1 for 3.27.60.1
>> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_open: Opened connection to 3.27.60.1
>> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
>> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_read: Got 59 bytes
>> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: dump_v4_echo: Echo from 3.27.60.1 (exp=1240, seq=1240, id=6890, dest=3.27.60.1, data=pingd-v4): Echo Reply
>> > pingd[6890]: 2009/07/24_14:47:18 debug: stand_alone_ping: Node 3.27.60.1 is alive
>> > pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_close: Closed connection to 3.27.60.1
>> > p
>> >
>> > "General Linux-HA mailing list" <[email protected]> wrote:
>> > > Below is part of the output with the error message produced by the command:
>> > > /usr/lib64/heartbeat/pingd -VVV -a pingd -d 10 -m 1000 -h 3.27.60.1
>> > >
>> > > The machine has three network interfaces and is connected to three
>> > > different subnets (3.27.x.x; 192.168.x.x, the cluster subnet; and
>> > > 172.22.x.x, dedicated to heartbeat).
>> > >
>> > > pingd[6890]: 2009/07/24_14:44:36 debug: debug2: ping_close: Closed connection to 3.27.60.1
>> > > pingd[6890]: 2009/07/24_14:44:36 debug: send_update: Sent update: pingd=1000 (1 active ping nodes)
>> > > pingd[6890]: 2009/07/24_14:44:37 debug: debug2: stand_alone_ping: Checking connectivity
>> > > pingd[6890]: 2009/07/24_14:44:37 debug: debug2: ping_open: Got address 3.27.60.1 for 3.27.60.1
>> > > pingd[6890]: 2009/07/24_14:44:37 debug: debug2: ping_open: Opened connection to 3.27.60.1
>> > > pingd[6890]: 2009/07/24_14:44:37 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
>> > > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_read: Got 59 bytes
>> > > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: dump_v4_echo: Echo from 3.27.60.1 (exp=1080, seq=1080, id=6890, dest=3.27.60.1, data=pingd-v4): Echo Reply
>> > > pingd[6890]: 2009/07/24_14:44:38 debug: stand_alone_ping: Node 3.27.60.1 is alive
>> > > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_close: Closed connection to 3.27.60.1
>> > > pingd[6890]: 2009/07/24_14:44:38 debug: send_update: Sent update: pingd=1000 (1 active ping nodes)
>> > > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: stand_alone_ping: Checking connectivity
>> > > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_open: Got address 3.27.60.1 for 3.27.60.1
>> > > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_open: Opened connection to 3.27.60.1
>> > > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
>> > > pingd[6890]: 2009/07/24_14:44:39 debug: debug2: ping_read: Got 262 bytes
>> > > No error message: -1: Resource temporarily unavailable (11)
>> > > pingd[6890]: 2009/07/24_14:44:39 debug: process_icmp_error: No error message: -1: Resource temporarily unavailable (11)
>> > > pingd[6890]: 2009/07/24_14:44:39 debug: debug2: dump_v4_echo: Echo from 172.22.10.2 (exp=1081, seq=0, id=0, dest=3.27.60.1, data=E?): Unreachable Port
>> > > pingd[6890]: 2009/07/24_14:44:39 info: stand_alone_ping: Node 3.27.60.1 is unreachable (read)
>> > > pingd[6890]: 2009/07/24_14:44:40 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
>> > > pingd[6890]: 2009/07/24_14:44:40 debug: debug2: ping_read: Got 262 bytes
>> > > No error message: -1: Resource temporarily unavailable (11)
>> > > pingd[6890]: 2009/07/24_14:44:40 debug: process_icmp_error: No error message: -1: Resource temporarily unavailable (11)
>> > > pingd[6890]: 2009/07/24_14:44:40 debug: debug2: dump_v4_echo: Echo from 192.168.0.5 (exp=1082, seq=0, id=0, dest=3.27.60.1, data=E?): Unreachable Port
>> > > pingd[6890]: 2009/07/24_14:44:40 info: stand_alone_ping: Node 3.27.60.1 is unreachable (read)
>> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_close: Closed connection to 3.27.60.1
>> > > pingd[6890]: 2009/07/24_14:44:41 debug: send_update: Sent update: pingd=0 (0 active ping nodes)
>> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: stand_alone_ping: Checking connectivity
>> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_open: Got address 3.27.60.1 for 3.27.60.1
>> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_open: Opened connection to 3.27.60.1
>> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
>> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_read: Got 59 bytes
>> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: dump_v4_echo: Echo from 3.27.60.1 (exp=1083, seq=1083, id=6890, dest=3.27.60.1, data=pingd-v4): Echo Reply
>> > > pingd[6890]: 2009/07/24_14:44:41 debug: stand_alone_ping: Node 3.27.60.1 is alive
>> > > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_close: Closed connection to 3.27.60.1
>> > > pingd[6890]: 2009/07/24_14:44:41 debug: send_update: Sent update: pingd=1000 (1 active ping nodes)
>> > >
>> > > Thanks
>> > > Jarek
>> > >
>> > > "General Linux-HA mailing list" <[email protected]> wrote:
>> > > > 2009/7/24 <[email protected]>:
>> > > > >
>> > > > > Rpms built for RHEL5:
>> > > > > heartbeat-common-2.99.2-8.1
>> > > > > libheartbeat2-2.99.2-8.1
>> > > > > heartbeat-2.99.2-8.1
>> > > > > heartbeat-resources-2.99.2-8.1
>> > > > > pacemaker-1.0.3-2.2
>> > > > > pacemaker-mgmt-client-1.99.1-2.1
>> > > > > libpacemaker3-1.0.3-2.2
>> > > > > pacemaker-mgmt-1.99.1-2.1
>> > > > >
>> > > > > If I start pingd manually (beside a working heartbeat+pacemaker), it
>> > > > > gives me the following whenever "stand_alone_ping: Node xx.yy.zz.ww
>> > > > > is unreachable (read)" appears in /var/log/ha-debug:
>> > > > >
>> > > > > [r...@gate2]# date; /usr/lib64/heartbeat/pingd -a pingd -d 10 -m 1000 -h xx.yy.zz.ww; date
>> > > > > Thu Jul 23 19:25:24 CEST 2009
>> > > > > No error message: -1: Resource temporarily unavailable (11)
>> > > > > No error message: -1: Resource temporarily unavailable (11)
>> > > > > No error message: -1: Resource temporarily unavailable (11)
>> > > > > No error message: -1: Resource temporarily unavailable (11)
>> > > > > No error message: -1: Resource temporarily unavailable (11)
>> > > > > No error message: -1: Resource temporarily unavailable (11)
>> > > > > ...
>> > > > >
>> > > > > System ping reports no errors.
>> > > >
>> > > > If you repeat that test with some extra -V arguments, you should see
>> > > > more information (which would be helpful).
>> > > > But it's pretty clear there must be a bug, so it's probably worth
>> > > > creating an entry in bugzilla.
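Until such a bug is resolved, something along the lines of Jarek's check_connection idea could feed the attribute itself: attrd_updater (shipped with pacemaker) can set the same node attribute pingd maintains. A rough, untested sketch; the attribute name and values simply mirror the pingd invocation above, and the loop/cron wrapper is left to the reader:

```
#!/bin/sh
# Hypothetical pingd stand-in: set pingd=1000 while the ping node
# answers ICMP, and 0 when it stops (run this periodically).
PINGNODE=3.27.60.1

if ping -n -q -c 3 "$PINGNODE" >/dev/null 2>&1; then
    attrd_updater -n pingd -v 1000
else
    attrd_updater -n pingd -v 0
fi
```

The existing rsc_location rules on the pingd attribute would then relocate the resource group exactly as they do with the real pingd.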
>>
>> ------------------------------
>>
>> Message: 6
>> Date: Tue, 28 Jul 2009 16:55:21 +0200
>> From: Miguel Olivares <[email protected]>
>> Subject: [Linux-HA] HA cluster 2 IP Service and bonding
>> To: [email protected]
>> Message-ID: <[email protected]>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>
>> Hello,
>>
>> I have 2 servers in my cluster, with 4 Ethernet cards each; I used
>> bonding in order to have full redundancy. My configuration works, but
>> not as well as I expected. What I want is this: when one of the two
>> bonding interfaces, "bond0" or "bond1", stops responding on the primary
>> server, that server should give up so the other server can take over the
>> service, but I don't know how to do that. My config files ha.cf and
>> haresources are below.
>>
>> eth0 and eth1 -> bond0  # on Server1 and Server2
>> eth2 and eth3 -> bond1  # on Server1 and Server2
>>
>> 192.168.1.10   Server1      # bond0
>> 192.168.1.11   Server2      # bond0
>> 192.168.1.12   virtual IP1
>>
>> 172.16.10.10   Server1      # bond1
>> 172.16.10.11   Server2      # bond1
>> 172.16.10.12   virtual IP2
>>
>> With my configuration I see both virtual IPs on Server1, but when I put
>> down bond1, Server1 still continues as the primary node.
>> Can anybody help me?
>>
>> Thanks,
>>
>> regards.
>>
>> [ha.cf]
>> debugfile /var/log/ha-debug
>> logfile /var/log/ha-log
>> keepalive 2
>> deadtime 30
>> warntime 10
>> initdead 60
>> udpport 694
>> bcast bond0 bond1
>> auto_failback off
>> node Server1 Server2
>> ping 192.168.1.254
>> ping 172.16.10.254
>>
>> [haresources]
>> Server1 192.168.1.12 172.16.1.12
>>
>> ------------------------------
>>
>> End of Linux-HA Digest, Vol 68, Issue 63
>> ****************************************
>
> --
> Regards,
>
> Ahmed Munir

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
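On the bonding question in Message 6: with a v1-style (haresources) configuration, failover on lost connectivity is usually delegated to heartbeat's ipfail plugin, which watches the ping nodes already declared in ha.cf and hands the resources over when the active node can reach fewer of them than its peer. A sketch of the extra ha.cf line; the binary's path varies by distribution, so treat it as an assumption:

```
# ha.cf addition (sketch): ipfail compares ping-node reachability between
# the cluster nodes and triggers failover when the active node loses a link
respawn hacluster /usr/lib/heartbeat/ipfail
```

Since ha.cf already pings one gateway per bond (192.168.1.254 and 172.16.10.254), losing bond1 should then cost the active node a ping node and cause the switch.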
