Re: [Linux-HA] problem with ipfail pingd

Johan Hoeke Sat, 29 Dec 2007 16:58:47 -0800

holgi wrote:
> Hi,
> I am testing a migration from heartbeat v1 to heartbeat v2.
> In v2, ipfail (or rather pingd) doesn't work as I expected.
> Following is my ha.cf:
> 
> debugfile /var/log/ha-debug
> logfile /var/log/ha-log
> logfacility     local0
> keepalive 2
> deadtime 30
> warntime 10
> initdead 90
> udpport 694
> bcast   eth0 eth1
> auto_failback on
> watchdog /dev/watchdog
> node master
> node slave
> #apiauth        ping gid=root uid=root
> #respawn root /usr/lib/heartbeat/pingd -m 100 -d 5s
> respawn hacluster /usr/lib/heartbeat/ipfail
> ping 192.168.1.1
> crm     yes
> 
> As you can see, I tried it with both pingd and ipfail.
> 
> This is an excerpt from the ha-debug logfile:
> 
> startup:
> heartbeat[21610]: 2007/12/29_18:21:44 info: glib: ping heartbeat started.
> heartbeat[21610]: 2007/12/29_18:21:44 notice: Using watchdog device:
> /dev/watchdog
> heartbeat[21610]: 2007/12/29_18:21:44 info: G_main_add_SignalHandler:
> Added signal handler for signal 17
> heartbeat[21610]: 2007/12/29_18:21:45 info: Local status now set to: 'up'
> heartbeat[21610]: 2007/12/29_18:21:46 info: Link master:eth0 up.
> heartbeat[21610]: 2007/12/29_18:21:46 info: Link master:eth1 up.
> heartbeat[21610]: 2007/12/29_18:21:46 info: Link 192.168.1.1:192.168.1.1
> up.
> heartbeat[21610]: 2007/12/29_18:21:46 info: Status update for node
> 192.168.1.1: status ping
> heartbeat[21610]: 2007/12/29_18:22:07 info: Link slave:eth0 up.
> heartbeat[21610]: 2007/12/29_18:22:07 info: Status update for node
> slave: status up
> heartbeat[21610]: 2007/12/29_18:22:07 info: Link slave:eth1 up.
> 
> cutting the cable:
> heartbeat[21610]: 2007/12/29_18:25:35 WARN: node 192.168.1.1: is dead
> heartbeat[21610]: 2007/12/29_18:25:35 info: Link slave:eth0 dead.
> heartbeat[21610]: 2007/12/29_18:25:35 info: Link 192.168.1.1:192.168.1.1
> dead.
> crmd[21627]: 2007/12/29_18:25:35 notice: crmd_ha_status_callback: Status
> update: Node 192.168.1.1 now has status [dead]
> crmd[21627]: 2007/12/29_18:25:36 WARN: get_uuid: Could not calculate
> UUID for 192.168.1.1
> crmd[21627]: 2007/12/29_18:25:36 info: crmd_ha_status_callback: Ping
> node 192.168.1.1 is dead
> 
> but nothing else happens....
> 
> Under version v1, node2 (slave) grabs all resources as expected.
> 
> Could someone please be so kind as to point out what I am missing?
> 
> with kind regards
> holgi
>


Hi Holgi,

Are you using a constraint with pingd or ipfail?

From the pingd page: http://www.linux-ha.org/pingd

"Both methods also require the addition of one-or-more colocation
constraints to the CIB. See "Using pingd Output in Location Constraints"
below."

I'm not an expert by any means, but I have used the example code below for
my hbaping stuff.

<rsc_location id="my_resource:connected" rsc="my_resource">
  <rule id="my_resource:connected:rule" score="-INFINITY" boolean_op="or">
    <expression id="my_resource:connected:expr:undefined"
      attribute="pingd" operation="not_defined"/>
    <expression id="my_resource:connected:expr:zero"
      attribute="pingd" operation="lte" value="0"/>
  </rule>
</rsc_location>

That works fine: services fail over to the other node if the connection is
lost.

If you are already using a constraint, post the output of cibadmin -Ql as
well, so folks on the list can help figure out what's wrong.
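
One more thing, going by the ha.cf you posted: as far as I know ipfail is
meant for v1-style clusters and does not work together with "crm yes", so
on v2 the pingd respawn line you commented out would have to be the active
one. A minimal sketch of that part of ha.cf, reusing your own lines (I have
not tested this on your setup, and the -m/-d values are simply the ones you
already had):

ping 192.168.1.1
# ipfail line removed; pingd takes over its job under the CRM
respawn root /usr/lib/heartbeat/pingd -m 100 -d 5s
crm     yes

With -m 100 each reachable ping node adds 100 to the "pingd" attribute, so
with a single ping node it should drop to 0 when 192.168.1.1 is unreachable,
which is what the "lte 0" expression in the constraint above matches.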

cheers,
Johan


_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
