On Monday, 2 March 2009 08:35:51, Andrew Beekhof wrote:
> On Fri, Feb 27, 2009 at 16:04, Michael Schwartzkopff <[email protected]> wrote:
> > On Friday, 27 February 2009 15:21:34, Michael Schwartzkopff wrote:
> >> Hi,
> >>
> >> My system: Debian Lenny, heartbeat 2.99.2-1, pacemaker 1.0.1-1.
> >>
> >> In ha.cf I have 2 ping nodes:
> >> ping 82.135.103.97 192.168.188.19
> >>
> >> From the command line I can ping both hosts. When I start heartbeat I
> >> see that my node is sending and receiving icmp packets to and from both
> >> hosts.
> >>
> >> Also the log file recognises two ping nodes:
> >> Feb 27 14:13:54 fw4 heartbeat: [24762]: info: Link
> >> 82.135.103.97:82.135.103.97 up.
> >> Feb 27 14:13:54 fw4 heartbeat: [24762]: info: Status update for node
> >> 82.135.103.97: status ping
> >> Feb 27 14:13:54 fw4 heartbeat: [24762]: info: Status update for node
> >> 192.168.188.119: status ping
> >> Feb 27 14:13:54 fw4 heartbeat: [24762]: info: Link
> >> 192.168.188.119:192.168.188.119 up.
> >>
> >> When I start pingd manually
> >> /usr/lib/heartbeat/pingd -m100 -d5
> >>
> >> I see that the node only recognises ONE ping node and accordingly only
> >> gives 100 points:
> >
> > (...)
> >
> >> The problem occurs on BOTH nodes of the cluster. The ping node that is
> >> not recognized is the default router. Does anybody have any idea what
> >> went wrong here? Thanks for helping.
> >
> > Hi,
> > playing further with the cluster, it seems that the score always comes out
> > one ping node short.
> > I configured 3 ping nodes and I get 200 points.
> > When I disable ICMP for one ping node I get 100 points.
> > When I disable ICMP for the next host I get 0 points.
> > When I disable ICMP for the last ping node I get -100 points.
> >
> > Is this feature documented somewhere?
>
> It's not a feature.
> Can you add a couple of -V options to your pingd command and attach the
> logs?

Here you are:

Mar  2 11:31:07 fw4 pingd: [21065]: info: Invoked: /usr/lib/heartbeat/pingd -V -V -V -V -V -m 100 -d5s -a pingd
Mar  2 11:31:07 fw4 pingd: [21065]: debug: main: attrd registration attempt: 0
Mar  2 11:31:12 fw4 pingd: [21065]: debug: init_client_ipc_comms_nodispatch: Attempting to talk on: /var/run/heartbeat/crm/attrd
Mar  2 11:31:12 fw4 pingd: [21065]: debug: debug3: init_client_ipc_comms_nodispatch: Processing of /var/run/heartbeat/crm/attrd complete
Mar  2 11:31:12 fw4 pingd: [21065]: debug: register_with_ha: Signing in with Heartbeat
Mar  2 11:31:12 fw4 pingd: [21065]: debug: debug2: do_node_walk: Invoked
Mar  2 11:31:12 fw4 pingd: [21065]: debug: debug3: do_node_walk: Requesting an initial dump of CRMD client_status
Mar  2 11:31:13 fw4 pingd: [21065]: info: do_node_walk: Requesting the list of configured nodes
Mar  2 11:31:13 fw4 pingd: [21065]: debug: do_node_walk: Node fw3: skipping 'normal'
Mar  2 11:31:13 fw4 pingd: [21065]: debug: do_node_walk: Node fw4: skipping 'normal'
Mar  2 11:31:14 fw4 pingd: [21065]: debug: do_node_walk: Adding: 192.168.189.4=ping
Mar  2 11:31:14 fw4 pingd: [21065]: debug: do_node_walk: Adding: 192.168.188.110=ping
Mar  2 11:31:15 fw4 pingd: [21065]: debug: do_node_walk: Adding: 82.135.103.97=ping
Mar  2 11:31:15 fw4 pingd: [21065]: debug: debug2: do_node_walk: Complete
Mar  2 11:31:15 fw4 pingd: [21065]: info: send_update: 2 active ping nodes
Mar  2 11:31:15 fw4 pingd: [21065]: debug: debug3: register_with_ha: Be informed of Node Status changes
Mar  2 11:31:15 fw4 pingd: [21065]: debug: debug3: register_with_ha: Adding channel to mainloop
Mar  2 11:31:15 fw4 pingd: [21065]: info: main: Starting pingd
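
The mismatch in the startup log above can be checked mechanically. The following is only a rough sketch: it counts the "do_node_walk: Adding:" entries and compares them with the number in the "send_update: N active ping nodes" entry. The sample text is copied from the log above with the mail line wrapping undone; the regular expressions and the function name summarize are my own and not part of pingd.

```python
import re

# Four entries copied from the startup log above (line wrapping undone).
STARTUP_LOG = """\
Mar  2 11:31:14 fw4 pingd: [21065]: debug: do_node_walk: Adding: 192.168.189.4=ping
Mar  2 11:31:14 fw4 pingd: [21065]: debug: do_node_walk: Adding: 192.168.188.110=ping
Mar  2 11:31:15 fw4 pingd: [21065]: debug: do_node_walk: Adding: 82.135.103.97=ping
Mar  2 11:31:15 fw4 pingd: [21065]: info: send_update: 2 active ping nodes
"""

# Patterns are assumptions based on the log format shown in this mail.
ADDING = re.compile(r"do_node_walk: Adding: (\S+)=ping")
ACTIVE = re.compile(r"send_update: (\d+) active ping nodes")


def summarize(log_text):
    """Return (ping nodes added, active counts reported) found in log_text."""
    added = ADDING.findall(log_text)
    active = [int(n) for n in ACTIVE.findall(log_text)]
    return added, active


added, active = summarize(STARTUP_LOG)
print("ping nodes added:", len(added), added)
print("active counts reported:", active)
if active and active[0] != len(added):
    print("mismatch:", len(added), "added but", active[0], "reported active")
```

Run against a complete syslog, the same function would also pick up the later send_update entries, so the count can be followed over time.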

OK, I started pingd. ha.cf tells pingd about 3 ping nodes and all three are
added, but pingd reports only 2 active ping nodes. Now I run

iptables -I INPUT -p icmp -s 82.135.103.97 -j DROP

to simulate a network failure. The failure shows up in syslog and the score
is reduced by 100:

Mar  2 11:31:49 fw4 pingd: [21065]: debug: debug2: pingd_ha_dispatch: Invoked
Mar  2 11:31:49 fw4 pingd: [21065]: notice: pingd_nstatus_callback: Status update: Ping node 82.135.103.97 now has status [dead]
Mar  2 11:31:49 fw4 pingd: [21065]: info: send_update: 1 active ping nodes
Mar  2 11:31:49 fw4 pingd: [21065]: debug: debug2: pingd_ha_dispatch: no message ready yet
Mar  2 11:31:49 fw4 heartbeat: [20957]: WARN: node 82.135.103.97: is dead
Mar  2 11:31:49 fw4 crmd: [20979]: notice: crmd_ha_status_callback: Status update: Node 82.135.103.97 now has status [dead] (DC=false)
Mar  2 11:31:49 fw4 crmd: [20979]: WARN: get_uuid: Could not calculate UUID for 82.135.103.97
Mar  2 11:31:54 fw4 attrd: [20978]: info: attrd_trigger_update: Sending flush op to all hosts for: pingd
Mar  2 11:31:54 fw4 attrd: [20978]: info: attrd_ha_callback: flush message from fw4
Mar  2 11:31:54 fw4 cib: [20975]: info: cib_process_xpath: Processing cib_query op for //cib/status//node_sta...@id='cbb68cf2-594f-4775-a604-ded2f6aa08a5']//nvpa...@name='pingd'] (/cib/status/node_state[2]/transient_attributes/instance_attributes/nvpair[2])
Mar  2 11:31:54 fw4 attrd: [20978]: info: attrd_perform_update: Sent update 19: pingd=100

I remove the iptables rule and the cluster can see the ping node again:
Mar  2 11:32:01 fw4 pingd: [21065]: debug: debug2: pingd_ha_dispatch: Invoked
Mar  2 11:32:01 fw4 pingd: [21065]: notice: pingd_nstatus_callback: Status update: Ping node 82.135.103.97 now has status [ping]
Mar  2 11:32:01 fw4 pingd: [21065]: info: send_update: 2 active ping nodes
Mar  2 11:32:01 fw4 pingd: [21065]: debug: debug2: pingd_ha_dispatch: no message ready yet
Mar  2 11:32:01 fw4 heartbeat: [20957]: WARN: Late heartbeat: Node 82.135.103.97: interval 18000 ms
Mar  2 11:32:01 fw4 heartbeat: [20957]: info: Status update for node 82.135.103.97: status ping
Mar  2 11:32:01 fw4 crmd: [20979]: notice: crmd_ha_status_callback: Status update: Node 82.135.103.97 now has status [ping] (DC=false)
Mar  2 11:32:01 fw4 crmd: [20979]: WARN: get_uuid: Could not calculate UUID for 82.135.103.97
Mar  2 11:32:06 fw4 attrd: [20978]: info: attrd_trigger_update: Sending flush op to all hosts for: pingd
Mar  2 11:32:06 fw4 cib: [20975]: info: cib_process_xpath: Processing cib_query op for //cib/status//node_sta...@id='cbb68cf2-594f-4775-a604-ded2f6aa08a5']//nvpa...@name='pingd'] (/cib/status/node_state[2]/transient_attributes/instance_attributes/nvpair[2])
Mar  2 11:32:06 fw4 attrd: [20978]: info: attrd_ha_callback: flush message from fw4
Mar  2 11:32:06 fw4 attrd: [20978]: info: attrd_perform_update: Sent update 21: pingd=200

The same applies to the other two ping nodes.
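
To put numbers on it, here is a small sketch that compares the score I would expect with the value attrd actually sent. It assumes the score should simply be the -m multiplier (100) times the number of reachable ping nodes; the "reachable" column is my own reading of the "Adding: ...=ping" entries and of the iptables test, and the function name check is made up for this illustration.

```python
MULTIPLIER = 100  # the value passed with -m on the pingd command line

# (reachable ping nodes, "active ping nodes" reported by pingd, pingd value sent)
# The last two columns are taken from the logs in this mail; the first column
# is an assumption about what the real state of the network was.
observations = [
    (3, 2, 200),  # all three ping nodes answer (update 21 at 11:32:06)
    (2, 1, 100),  # 82.135.103.97 blocked by iptables (update 19 at 11:31:54)
]


def check(reachable, reported, attribute, multiplier=MULTIPLIER):
    """Compare the expected score with the observed attribute value."""
    return {
        "expected": reachable * multiplier,
        "observed": attribute,
        # True if the attribute is consistent with the count pingd reported.
        "attribute_follows_reported_count": attribute == reported * multiplier,
        # How many ping nodes are missing from the reported count.
        "count_deficit": reachable - reported,
    }


for obs in observations:
    print(obs, check(*obs))
```

In both cases the attribute matches the reported count times 100, so the multiplication itself looks fine and it is the "active ping nodes" count that is one too low.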


-- 
Dr. Michael Schwartzkopff
MultiNET Services GmbH
Address: Bretonischer Ring 7; 85630 Grasbrunn; Germany
Tel: +49 - 89 - 45 69 11 0
Fax: +49 - 89 - 45 69 11 21
mob: +49 - 174 - 343 28 75

mail: [email protected]
web: www.multinet.de

Registered office: 85630 Grasbrunn
Commercial register: Amtsgericht München, HRB 114375
Managing directors: Günter Jurgeneit, Hubert Martens

---

PGP Fingerprint: F919 3919 FF12 ED5A 2801 DEA6 AA77 57A4 EDD8 979B
Skype: misch42