On Monday, 2 March 2009 08:35:51, Andrew Beekhof wrote:
> On Fri, Feb 27, 2009 at 16:04, Michael Schwartzkopff <[email protected]> wrote:
> > On Friday, 27 February 2009 15:21:34, Michael Schwartzkopff wrote:
> >> Hi,
> >>
> >> my system: debian lenny, heartbeat 2.99.2-1, pacemaker 1.0.1-1.
> >>
> >> In ha.cf I have 2 ping nodes:
> >> ping 82.135.103.97 192.168.188.19
> >>
> >> From the command line I can ping both hosts. When I start heartbeat I
> >> see that my node is sending and receiving icmp packets to and from both
> >> hosts.
> >>
> >> Also the log file recognises two ping nodes:
> >> Feb 27 14:13:54 fw4 heartbeat: [24762]: info: Link
> >> 82.135.103.97:82.135.103.97 up.
> >> Feb 27 14:13:54 fw4 heartbeat: [24762]: info: Status update for node
> >> 82.135.103.97: status ping
> >> Feb 27 14:13:54 fw4 heartbeat: [24762]: info: Status update for node
> >> 192.168.188.119: status ping
> >> Feb 27 14:13:54 fw4 heartbeat: [24762]: info: Link
> >> 192.168.188.119:192.168.188.119 up.
> >>
> >> When I start pingd manually
> >> /usr/lib/heartbeat/pingd -m100 -d5
> >>
> >> I see that the node only recognises ONE ping node and accordingly only
> >> gives 100 points:
> >
> > (...)
> >
> >> the problem occurs on BOTH nodes of the cluster. The ping node that is
> >> not recognized is the default router. Anybody any idea what went wrong
> >> here? Thanks for helping.
> >
> > Hi,
> > playing further with the cluster it seems that it always counts one ping
> > node too few.
> > I configured 3 ping nodes and I got 200 points.
> > When I disable icmp for one ping node I get 100 points.
> > When I disable icmp for the next host I get 0 points.
> > When I disable icmp for the last ping node I get -100 points.
> >
> > Is this feature documented somewhere?
>
> It's not a feature.
> Can you add a couple of -V options to your pingd command and attach the
> logs?

Here you are:

Mar 2 11:31:07 fw4 pingd: [21065]: info: Invoked: /usr/lib/heartbeat/pingd -V -V -V -V -V -m 100 -d5s -a pingd
Mar 2 11:31:07 fw4 pingd: [21065]: debug: main: attrd registration attempt: 0
Mar 2 11:31:12 fw4 pingd: [21065]: debug: init_client_ipc_comms_nodispatch: Attempting to talk on: /var/run/heartbeat/crm/attrd
Mar 2 11:31:12 fw4 pingd: [21065]: debug: debug3: init_client_ipc_comms_nodispatch: Processing of /var/run/heartbeat/crm/attrd complete
Mar 2 11:31:12 fw4 pingd: [21065]: debug: register_with_ha: Signing in with Heartbeat
Mar 2 11:31:12 fw4 pingd: [21065]: debug: debug2: do_node_walk: Invoked
Mar 2 11:31:12 fw4 pingd: [21065]: debug: debug3: do_node_walk: Requesting an initial dump of CRMD client_status
Mar 2 11:31:13 fw4 pingd: [21065]: info: do_node_walk: Requesting the list of configured nodes
Mar 2 11:31:13 fw4 pingd: [21065]: debug: do_node_walk: Node fw3: skipping 'normal'
Mar 2 11:31:13 fw4 pingd: [21065]: debug: do_node_walk: Node fw4: skipping 'normal'
Mar 2 11:31:14 fw4 pingd: [21065]: debug: do_node_walk: Adding: 192.168.189.4=ping
Mar 2 11:31:14 fw4 pingd: [21065]: debug: do_node_walk: Adding: 192.168.188.110=ping
Mar 2 11:31:15 fw4 pingd: [21065]: debug: do_node_walk: Adding: 82.135.103.97=ping
Mar 2 11:31:15 fw4 pingd: [21065]: debug: debug2: do_node_walk: Complete
Mar 2 11:31:15 fw4 pingd: [21065]: info: send_update: 2 active ping nodes
Mar 2 11:31:15 fw4 pingd: [21065]: debug: debug3: register_with_ha: Be informed of Node Status changes
Mar 2 11:31:15 fw4 pingd: [21065]: debug: debug3: register_with_ha: Adding channel to mainloop
Mar 2 11:31:15 fw4 pingd: [21065]: info: main: Starting pingd

OK, I started pingd. It adds all 3 ping nodes from ha.cf, but reports only
2 active ping nodes. Now I run

iptables -I INPUT -p icmp -s 82.135.103.97 -j DROP

to simulate a network failure.

Syslog shows that the failure is detected and the score drops by 100:

Mar 2 11:31:49 fw4 pingd: [21065]: debug: debug2: pingd_ha_dispatch: Invoked
Mar 2 11:31:49 fw4 pingd: [21065]: notice: pingd_nstatus_callback: Status update: Ping node 82.135.103.97 now has status [dead]
Mar 2 11:31:49 fw4 pingd: [21065]: info: send_update: 1 active ping nodes
Mar 2 11:31:49 fw4 pingd: [21065]: debug: debug2: pingd_ha_dispatch: no message ready yet
Mar 2 11:31:49 fw4 heartbeat: [20957]: WARN: node 82.135.103.97: is dead
Mar 2 11:31:49 fw4 crmd: [20979]: notice: crmd_ha_status_callback: Status update: Node 82.135.103.97 now has status [dead] (DC=false)
Mar 2 11:31:49 fw4 crmd: [20979]: WARN: get_uuid: Could not calculate UUID for 82.135.103.97
Mar 2 11:31:54 fw4 attrd: [20978]: info: attrd_trigger_update: Sending flush op to all hosts for: pingd
Mar 2 11:31:54 fw4 attrd: [20978]: info: attrd_ha_callback: flush message from fw4
Mar 2 11:31:54 fw4 cib: [20975]: info: cib_process_xpath: Processing cib_query op for //cib/status//node_sta...@id='cbb68cf2-594f-4775-a604-ded2f6aa08a5']//nvpa...@name='pingd'] (/cib/status/node_state[2]/transient_attributes/instance_attributes/nvpair[2])
Mar 2 11:31:54 fw4 attrd: [20978]: info: attrd_perform_update: Sent update 19: pingd=100

I remove the iptables rule and the cluster can see the ping node again:

Mar 2 11:32:01 fw4 pingd: [21065]: debug: debug2: pingd_ha_dispatch: Invoked
Mar 2 11:32:01 fw4 pingd: [21065]: notice: pingd_nstatus_callback: Status update: Ping node 82.135.103.97 now has status [ping]
Mar 2 11:32:01 fw4 pingd: [21065]: info: send_update: 2 active ping nodes
Mar 2 11:32:01 fw4 pingd: [21065]: debug: debug2: pingd_ha_dispatch: no message ready yet
Mar 2 11:32:01 fw4 heartbeat: [20957]: WARN: Late heartbeat: Node 82.135.103.97: interval 18000 ms
Mar 2 11:32:01 fw4 heartbeat: [20957]: info: Status update for node 82.135.103.97: status ping
Mar 2 11:32:01 fw4 crmd: [20979]: notice: crmd_ha_status_callback: Status update: Node 82.135.103.97 now has status [ping] (DC=false)
Mar 2 11:32:01 fw4 crmd: [20979]: WARN: get_uuid: Could not calculate UUID for 82.135.103.97
Mar 2 11:32:06 fw4 attrd: [20978]: info: attrd_trigger_update: Sending flush op to all hosts for: pingd
Mar 2 11:32:06 fw4 cib: [20975]: info: cib_process_xpath: Processing cib_query op for //cib/status//node_sta...@id='cbb68cf2-594f-4775-a604-ded2f6aa08a5']//nvpa...@name='pingd'] (/cib/status/node_state[2]/transient_attributes/instance_attributes/nvpair[2])
Mar 2 11:32:06 fw4 attrd: [20978]: info: attrd_ha_callback: flush message from fw4
Mar 2 11:32:06 fw4 attrd: [20978]: info: attrd_perform_update: Sent update 21: pingd=200

The same applies to the other two ping nodes.

-- 
Dr. Michael Schwartzkopff
MultiNET Services GmbH
Address: Bretonischer Ring 7; 85630 Grasbrunn; Germany
Tel: +49 - 89 - 45 69 11 0
Fax: +49 - 89 - 45 69 11 21
mob: +49 - 174 - 343 28 75

mail: [email protected]
web: www.multinet.de

Registered office: 85630 Grasbrunn
Register court: Amtsgericht München HRB 114375
Managing directors: Günter Jurgeneit, Hubert Martens

---

PGP Fingerprint: F919 3919 FF12 ED5A 2801 DEA6 AA77 57A4 EDD8 979B
Skype: misch42
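
As a cross-check on the logs in this thread, here is a minimal sketch in Python (not part of pingd or heartbeat) that counts the ping nodes pingd adds at startup and compares them with the active counts and scores it reports. The regular expressions, the names `summarize` and `expected_score`, and the assumption that the score should equal the number of reachable ping nodes times the -m multiplier are mine, inferred from the quoted log lines; the sample text is copied from those lines.

```python
import re

# Patterns for the pingd/attrd syslog lines quoted in this thread.
# They are assumptions based on those lines, not a documented log format.
ADDING = re.compile(r"do_node_walk: Adding: (\S+)=ping")
ACTIVE = re.compile(r"send_update: (\d+) active ping nodes")
SENT = re.compile(r"attrd_perform_update: Sent update \d+: pingd=(-?\d+)")


def summarize(log_text):
    """Return (ping nodes added, active counts reported, pingd scores sent)."""
    nodes = ADDING.findall(log_text)
    active = [int(n) for n in ACTIVE.findall(log_text)]
    scores = [int(s) for s in SENT.findall(log_text)]
    return nodes, active, scores


def expected_score(reachable, multiplier):
    """Score one would expect if every reachable ping node counted once (assumption)."""
    return reachable * multiplier


# Excerpt copied from the log lines posted above.
SAMPLE = """\
Mar 2 11:31:14 fw4 pingd: [21065]: debug: do_node_walk: Adding: 192.168.189.4=ping
Mar 2 11:31:14 fw4 pingd: [21065]: debug: do_node_walk: Adding: 192.168.188.110=ping
Mar 2 11:31:15 fw4 pingd: [21065]: debug: do_node_walk: Adding: 82.135.103.97=ping
Mar 2 11:31:15 fw4 pingd: [21065]: info: send_update: 2 active ping nodes
Mar 2 11:31:49 fw4 pingd: [21065]: info: send_update: 1 active ping nodes
Mar 2 11:31:54 fw4 attrd: [20978]: info: attrd_perform_update: Sent update 19: pingd=100
Mar 2 11:32:01 fw4 pingd: [21065]: info: send_update: 2 active ping nodes
Mar 2 11:32:06 fw4 attrd: [20978]: info: attrd_perform_update: Sent update 21: pingd=200
"""

if __name__ == "__main__":
    nodes, active, scores = summarize(SAMPLE)
    print("ping nodes added:   ", len(nodes), nodes)
    print("active counts:      ", active)
    print("scores sent:        ", scores)
    print("expected, all up:   ", expected_score(len(nodes), 100))
```

On the sample this finds three added ping nodes, active counts of 2, 1 and 2, and scores of 100 and 200, whereas three reachable nodes at -m 100 would be expected to give 300. Each score sent matches the reported active count times 100, which suggests the active count itself is one too low rather than the multiplication being wrong, though that is only what these log lines imply.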
