Hi... I'm getting started with heartbeat, but I can't see what is wrong in my configuration (v1 style for now)...
For testing purposes I want to have HA for an Apache server in active/passive mode. I'm using heartbeat 2.1.3-3, installed from RPMs, and this is my network setup:

                              100.0.4.100
                                   |
   100.0.4.145/255.0.0.0           |           100.0.4.180/255.0.0.0
   nodo1 <-------------------------------------------------------> nodo2
     ^                                                               ^
     | 192.168.140.1/255.255.255.0       192.168.140.2/255.255.255.0 |
     |_______________________________________________________________|
                     heartbeat communication channel

where:
- 100.0.4.100 is a virtual IP
- the heartbeat communication channel is an Ethernet crossover cable (I also tried going through a switch, with the same problem)
- the other connections go through a switch
- nodo1 is the master node (maximatt)
- nodo2 is the backup node (einstein)

I tested the connections between the nodes and they are OK... believe me.. ;)

So... these are the actions I take for testing:

1) I start up the master node, and it comes up offering the services OK (tested via web browser). This is the master log:

heartbeat[8444]: 2008/06/10_11:38:06 info: Version 2 support: false
heartbeat[8444]: 2008/06/10_11:38:06 WARN: Logging daemon is disabled --enabling logging daemon is recommended
heartbeat[8444]: 2008/06/10_11:38:06 info: **************************
heartbeat[8444]: 2008/06/10_11:38:06 info: Configuration validated. Starting heartbeat 2.1.3
heartbeat[8445]: 2008/06/10_11:38:06 info: heartbeat: version 2.1.3
heartbeat[8445]: 2008/06/10_11:38:06 info: Heartbeat generation: 1207833064
heartbeat[8445]: 2008/06/10_11:38:06 info: glib: UDP Broadcast heartbeat started on port 694 (694) interface dev20603
heartbeat[8445]: 2008/06/10_11:38:06 info: glib: UDP Broadcast heartbeat closed on port 694 interface dev20603 - Status: 1
heartbeat[8445]: 2008/06/10_11:38:06 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[8445]: 2008/06/10_11:38:06 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[8445]: 2008/06/10_11:38:06 info: G_main_add_SignalHandler: Added signal handler for signal 17
heartbeat[8445]: 2008/06/10_11:38:06 info: Local status now set to: 'up'
heartbeat[8445]: 2008/06/10_11:38:07 info: Link maximatt.prueba.uy:dev20603 up.
heartbeat[8445]: 2008/06/10_11:40:06 WARN: node einstein.prueba.uy: is dead
heartbeat[8445]: 2008/06/10_11:40:06 info: Comm_now_up(): updating status to active
heartbeat[8445]: 2008/06/10_11:40:06 info: Local status now set to: 'active'
heartbeat[8445]: 2008/06/10_11:40:06 info: Starting child client "/usr/lib/heartbeat/ipfail" (498,496)
heartbeat[8445]: 2008/06/10_11:40:06 WARN: No STONITH device configured.
heartbeat[8445]: 2008/06/10_11:40:06 WARN: Shared disks are not protected.
heartbeat[8445]: 2008/06/10_11:40:06 info: Resources being acquired from einstein.prueba.uy.
heartbeat[8477]: 2008/06/10_11:40:06 info: Starting "/usr/lib/heartbeat/ipfail" as uid 498 gid 496 (pid 8477)
harc[8478]: 2008/06/10_11:40:06 info: Running /etc/ha.d/rc.d/status status
mach_down[8507]: 2008/06/10_11:40:06 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
mach_down[8507]: 2008/06/10_11:40:06 info: mach_down takeover complete for node einstein.prueba.uy.
heartbeat[8445]: 2008/06/10_11:40:06 info: mach_down takeover complete.
heartbeat[8445]: 2008/06/10_11:40:06 info: Initial resource acquisition complete (mach_down)
IPaddr[8554]: 2008/06/10_11:40:06 INFO: Resource is stopped
heartbeat[8479]: 2008/06/10_11:40:06 info: Local Resource acquisition completed.
harc[8603]: 2008/06/10_11:40:06 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
ip-request-resp[8603]: 2008/06/10_11:40:06 received ip-request-resp 100.0.4.100 OK yes
ResourceManager[8624]: 2008/06/10_11:40:06 info: Acquiring resource group: maximatt.prueba.uy 100.0.4.100 httpd
IPaddr[8651]: 2008/06/10_11:40:07 INFO: Resource is stopped
ResourceManager[8624]: 2008/06/10_11:40:07 info: Running /etc/ha.d/resource.d/IPaddr 100.0.4.100 start
IPaddr[8727]: 2008/06/10_11:40:07 INFO: Using calculated nic for 100.0.4.100: eth1
IPaddr[8727]: 2008/06/10_11:40:07 INFO: Using calculated netmask for 100.0.4.100: 255.0.0.0
IPaddr[8727]: 2008/06/10_11:40:07 INFO: eval ifconfig eth1:0 100.0.4.100netmask 255.0.0.0 broadcast 100.255.255.255
IPaddr[8710]: 2008/06/10_11:40:07 INFO: Success
ResourceManager[8624]: 2008/06/10_11:40:07 info: Running /etc/init.d/httpd start
heartbeat[8445]: 2008/06/10_11:40:17 info: Local Resource acquisition completed. (none)
heartbeat[8445]: 2008/06/10_11:40:17 info: local resource transition completed.

2) I start the backup node... and within a few seconds the backup node starts serving the services too :( The consequence is that I have two nodes offering services... but I did not configure it that way. This is the master log (a continuation of the log pasted above):

heartbeat[8445]: 2008/06/10_11:42:17 info: Link einstein.prueba.uy:dev20603 up.
heartbeat[8445]: 2008/06/10_11:42:17 info: Status update for node einstein.prueba.uy: status init
heartbeat[8445]: 2008/06/10_11:42:17 info: Status update for node einstein.prueba.uy: status up
ipfail[8477]: 2008/06/10_11:42:17 info: Link Status update: Link einstein.prueba.uy/dev20603 now has status up
ipfail[8477]: 2008/06/10_11:42:17 info: Status update: Node einstein.prueba.uy now has status init
ipfail[8477]: 2008/06/10_11:42:17 info: Status update: Node einstein.prueba.uy now has status up
harc[8895]: 2008/06/10_11:42:17 info: Running /etc/ha.d/rc.d/status status
harc[8912]: 2008/06/10_11:42:17 info: Running /etc/ha.d/rc.d/status status
heartbeat[8445]: 2008/06/10_11:42:18 info: all clients are now paused
heartbeat[8445]: 2008/06/10_11:44:16 WARN: 1 lost packet(s) for [ einstein.prueba.uy] [124:126]
heartbeat[8445]: 2008/06/10_11:44:16 info: Status update for node einstein.prueba.uy: status active
heartbeat[8445]: 2008/06/10_11:44:16 info: No pkts missing from einstein.prueba.uy!
heartbeat[8445]: 2008/06/10_11:44:16 info: remote resource transition completed.
heartbeat[8445]: 2008/06/10_11:44:16 ERROR: Both machines own our resources!
heartbeat[8445]: 2008/06/10_11:44:16 ERROR: Both machines own foreign resources!
heartbeat[8445]: 2008/06/10_11:44:16 info: maximatt.prueba.uy wants to go standby [foreign]
heartbeat[8445]: 2008/06/10_11:44:16 ERROR: Both machines own our resources!
heartbeat[8445]: 2008/06/10_11:44:16 ERROR: Both machines own foreign resources!
ipfail[8477]: 2008/06/10_11:44:16 info: Status update: Node einstein.prueba.uy now has status active
harc[8933]: 2008/06/10_11:44:17 info: Running /etc/ha.d/rc.d/status status
heartbeat[8445]: 2008/06/10_11:44:18 ERROR: Both machines own our resources!
heartbeat[8445]: 2008/06/10_11:44:18 ERROR: Both machines own foreign resources!
heartbeat[8445]: 2008/06/10_11:44:19 WARN: Message hist queue is filling up (376 messages in queue)
heartbeat[8445]: 2008/06/10_11:44:20 WARN: Message hist queue is filling up (377 messages in queue)
heartbeat[8445]: 2008/06/10_11:44:21 WARN: Message hist queue is filling up (378 messages in queue)
heartbeat[8445]: 2008/06/10_11:44:22 WARN: Message hist queue is filling up (379 messages in queue)
heartbeat[8445]: 2008/06/10_11:44:23 WARN: Message hist queue is filling up (380 messages in queue)
heartbeat[8445]: 2008/06/10_11:44:24 WARN: Message hist queue is filling up (381 messages in queue)
heartbeat[8445]: 2008/06/10_11:44:25 WARN: Message hist queue is filling up (382 messages in queue)
heartbeat[8445]: 2008/06/10_11:44:26 WARN: Message hist queue is filling up (383 messages in queue)
heartbeat[8445]: 2008/06/10_11:44:26 WARN: No reply to standby request. Standby request cancelled.
heartbeat[8445]: 2008/06/10_11:44:27 ERROR: Both machines own our resources!
heartbeat[8445]: 2008/06/10_11:44:27 ERROR: Both machines own foreign resources!

This is the log for the backup node:

heartbeat[6105]: 2008/06/10_11:53:58 info: Version 2 support: false
heartbeat[6105]: 2008/06/10_11:53:58 WARN: Logging daemon is disabled --enabling logging daemon is recommended
heartbeat[6105]: 2008/06/10_11:53:58 info: **************************
heartbeat[6105]: 2008/06/10_11:53:58 info: Configuration validated. Starting heartbeat 2.1.3
heartbeat[6106]: 2008/06/10_11:53:58 info: heartbeat: version 2.1.3
heartbeat[6106]: 2008/06/10_11:53:58 info: Heartbeat generation: 1207843592
heartbeat[6106]: 2008/06/10_11:53:58 info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth0
heartbeat[6106]: 2008/06/10_11:53:58 info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1
heartbeat[6106]: 2008/06/10_11:53:58 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[6106]: 2008/06/10_11:53:58 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[6106]: 2008/06/10_11:53:58 info: G_main_add_SignalHandler: Added signal handler for signal 17
heartbeat[6106]: 2008/06/10_11:53:58 info: Local status now set to: 'up'
*heartbeat[6106]: 2008/06/10_11:55:59 WARN: node maximatt.prueba.uy: is dead*
heartbeat[6106]: 2008/06/10_11:55:59 info: Comm_now_up(): updating status to active
heartbeat[6106]: 2008/06/10_11:55:59 info: Local status now set to: 'active'
heartbeat[6106]: 2008/06/10_11:55:59 info: Starting child client "/usr/lib/heartbeat/ipfail" (498,496)
heartbeat[6106]: 2008/06/10_11:55:59 WARN: No STONITH device configured.
heartbeat[6106]: 2008/06/10_11:55:59 WARN: Shared disks are not protected.
heartbeat[6106]: 2008/06/10_11:55:59 info: Resources being acquired from maximatt.prueba.uy.
heartbeat[6116]: 2008/06/10_11:55:59 info: Starting "/usr/lib/heartbeat/ipfail" as uid 498 gid 496 (pid 6116)
heartbeat[6118]: 2008/06/10_11:55:59 info: No local resources [/usr/share/heartbeat/ResourceManager listkeys einstein.prueba.uy] to acquire.
harc[6117]: 2008/06/10_11:55:59 info: Running /etc/ha.d/rc.d/status status
mach_down[6146]: 2008/06/10_11:55:59 info: Taking over resource group 100.0.4.100
ResourceManager[6172]: 2008/06/10_11:55:59 info: Acquiring resource group: maximatt.prueba.uy 100.0.4.100 httpd
IPaddr[6199]: 2008/06/10_11:55:59 INFO: Resource is stopped
ResourceManager[6172]: 2008/06/10_11:55:59 info: Running /etc/ha.d/resource.d/IPaddr 100.0.4.100 start
IPaddr[6275]: 2008/06/10_11:55:59 INFO: Using calculated nic for 100.0.4.100: eth1
IPaddr[6275]: 2008/06/10_11:55:59 INFO: Using calculated netmask for 100.0.4.100: 255.0.0.0
IPaddr[6275]: 2008/06/10_11:55:59 INFO: eval ifconfig eth1:0 100.0.4.100netmask 255.0.0.0 broadcast 100.255.255.255
IPaddr[6258]: 2008/06/10_11:55:59 INFO: Success
ResourceManager[6172]: 2008/06/10_11:55:59 info: Running /etc/init.d/httpd start
mach_down[6146]: 2008/06/10_11:56:01 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
mach_down[6146]: 2008/06/10_11:56:01 info: mach_down takeover complete for node maximatt.prueba.uy.
heartbeat[6106]: 2008/06/10_11:56:01 info: mach_down takeover complete.
heartbeat[6106]: 2008/06/10_11:56:01 info: Initial resource acquisition complete (mach_down)
heartbeat[6106]: 2008/06/10_11:56:09 info: Local Resource acquisition completed. (none)
heartbeat[6106]: 2008/06/10_11:56:09 info: local resource transition completed.
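Both logs show the same gap: each node reports its peer as dead about two minutes after its own status became 'up', which matches the initdead 120 value in the ha.cf files in this message. A minimal sketch to measure that gap from the pasted lines (the helper names parse_timestamp and seconds_between are made up for illustration; only the log format is taken from the logs above):

```python
from datetime import datetime

# Two lines copied from the backup node's log in this message.
BACKUP_UP = "heartbeat[6106]: 2008/06/10_11:53:58 info: Local status now set to: 'up'"
BACKUP_DEAD = "heartbeat[6106]: 2008/06/10_11:55:59 WARN: node maximatt.prueba.uy: is dead"

INITDEAD = 120  # value of 'initdead' in both ha.cf files


def parse_timestamp(line):
    """Return the 'YYYY/MM/DD_HH:MM:SS' stamp that follows the 'name[pid]:' prefix."""
    stamp = line.split()[1]
    return datetime.strptime(stamp, "%Y/%m/%d_%H:%M:%S")


def seconds_between(first, second):
    """Seconds elapsed from the first log line to the second."""
    return (parse_timestamp(second) - parse_timestamp(first)).total_seconds()


elapsed = seconds_between(BACKUP_UP, BACKUP_DEAD)
print(f"{elapsed:.0f}s between 'up' and 'is dead' (initdead = {INITDEAD}s)")
```

On these two lines it reports 121 seconds; the equivalent pair in the master log (11:38:06 and 11:40:06) gives 120. That would be consistent with einstein hearing nothing from maximatt during the whole initdead window, though it does not say why.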
These are my configuration files:

ha.cf (master node):

debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 1
deadtime 30
initdead 120
udpport 694
bcast dev20603
auto_failback on
node maximatt.prueba.uy
node einstein.prueba.uy
respawn hacluster /usr/lib/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster

ha.cf (backup node):

debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 1
deadtime 30
initdead 120
udpport 694
bcast eth0
auto_failback on
node maximatt.prueba.uy
node einstein.prueba.uy
respawn hacluster /usr/lib/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster

haresources:

maximatt.prueba.uy 100.0.4.100 httpd

I can't understand why einstein sees maximatt as dead (while maximatt does not see einstein as dead)... I tested the connection and it is OK :(

What could be happening? Any suggestions? (I have been stuck on this issue for a few days.)

Thanks in advance!!!

Regards ;)
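To rule out an accidental difference between the two ha.cf files, they can be compared directive by directive. A minimal sketch, assuming the file contents are pasted in as strings (parse_ha_cf and diff_configs are hypothetical helper names, not part of heartbeat):

```python
COMMON_TAIL = """\
auto_failback on
node maximatt.prueba.uy
node einstein.prueba.uy
respawn hacluster /usr/lib/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster
"""

COMMON_HEAD = """\
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 1
deadtime 30
initdead 120
udpport 694
"""

# The two files from this message; they share everything except the bcast line.
MASTER = COMMON_HEAD + "bcast dev20603\n" + COMMON_TAIL
BACKUP = COMMON_HEAD + "bcast eth0\n" + COMMON_TAIL


def parse_ha_cf(text):
    """Map each directive name to the list of its values (directives such as 'node' repeat)."""
    directives = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        value = parts[1] if len(parts) > 1 else ""
        directives.setdefault(parts[0], []).append(value)
    return directives


def diff_configs(a, b):
    """Return {directive: (values_in_a, values_in_b)} for every directive that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k, []), b.get(k, [])) for k in keys if a.get(k) != b.get(k)}


print(diff_configs(parse_ha_cf(MASTER), parse_ha_cf(BACKUP)))
```

For the two files above the only differing directive is bcast (dev20603 on the master, eth0 on the backup). Interface names are local to each machine, so a difference is not wrong by itself, but it may be worth confirming that each one really is the interface on the 192.168.140.x crossover link.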
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems