Hi,
I'm getting started with heartbeat, but I can't see what is wrong in my
configuration (v1 style for now).
For testing purposes I want HA for an Apache server in active/passive
mode.
I'm using heartbeat 2.1.3-3, installed from RPMs, and this is my network
layout:
                        100.0.4.100
                             |
  100.0.4.145/255.0.0.0      |      100.0.4.180/255.0.0.0
nodo1 <----------------------+----------------------> nodo2
  ^                                                     ^
  | 192.168.140.1/255.255.255.0                         |
  |                         192.168.140.2/255.255.255.0 |
  |_____________________________________________________|
              heartbeat communication channel
where: - 100.0.4.100 is the virtual IP
       - the heartbeat communication channel is an Ethernet crossover
         cable (I also tried going through a switch, with the same problem)
       - the other connections go through a switch
       - nodo1 is the master node (maximatt)
       - nodo2 is the backup node (einstein)
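As a quick sanity check of the addressing above, here is a minimal Python
sketch (an illustration only, not part of heartbeat) that uses the standard
ipaddress module to confirm that the virtual IP falls inside the network of
the public interfaces and that the two heartbeat-link addresses share a
subnet. The dictionary and function names are made up for this example.

```python
import ipaddress

# Addresses and netmasks copied from the diagram above.
public_ifaces = {
    "nodo1": ipaddress.ip_interface("100.0.4.145/255.0.0.0"),
    "nodo2": ipaddress.ip_interface("100.0.4.180/255.0.0.0"),
}
heartbeat_ifaces = {
    "nodo1": ipaddress.ip_interface("192.168.140.1/255.255.255.0"),
    "nodo2": ipaddress.ip_interface("192.168.140.2/255.255.255.0"),
}
virtual_ip = ipaddress.ip_address("100.0.4.100")


def same_network(ifaces):
    """Return True if every interface in the mapping is in one network."""
    return len({iface.network for iface in ifaces.values()}) == 1


public_net = public_ifaces["nodo1"].network

print("public network:", public_net)
print("public broadcast:", public_net.broadcast_address)
print("virtual IP in public network:", virtual_ip in public_net)
print("public interfaces share a network:", same_network(public_ifaces))
print("heartbeat interfaces share a network:", same_network(heartbeat_ifaces))
```

The broadcast address it computes for the public network can be compared
with the one IPaddr reports in the logs further down.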
I tested the connections between the nodes and they are OK... believe me.. ;)
So, these are the steps I followed for testing:
1) I start up the master node. It comes up and offers the services OK
(tested via web browser). This is the master log:
heartbeat[8444]: 2008/06/10_11:38:06 info: Version 2 support: false
heartbeat[8444]: 2008/06/10_11:38:06 WARN: Logging daemon is disabled
--enabling logging daemon is recommended
heartbeat[8444]: 2008/06/10_11:38:06 info: **************************
heartbeat[8444]: 2008/06/10_11:38:06 info: Configuration validated. Starting
heartbeat 2.1.3
heartbeat[8445]: 2008/06/10_11:38:06 info: heartbeat: version 2.1.3
heartbeat[8445]: 2008/06/10_11:38:06 info: Heartbeat generation: 1207833064
heartbeat[8445]: 2008/06/10_11:38:06 info: glib: UDP Broadcast heartbeat
started on port 694 (694) interface dev20603
heartbeat[8445]: 2008/06/10_11:38:06 info: glib: UDP Broadcast heartbeat
closed on port 694 interface dev20603 - Status: 1
heartbeat[8445]: 2008/06/10_11:38:06 info: G_main_add_TriggerHandler: Added
signal manual handler
heartbeat[8445]: 2008/06/10_11:38:06 info: G_main_add_TriggerHandler: Added
signal manual handler
heartbeat[8445]: 2008/06/10_11:38:06 info: G_main_add_SignalHandler: Added
signal handler for signal 17
heartbeat[8445]: 2008/06/10_11:38:06 info: Local status now set to: 'up'
heartbeat[8445]: 2008/06/10_11:38:07 info: Link maximatt.prueba.uy:dev20603
up.
heartbeat[8445]: 2008/06/10_11:40:06 WARN: node einstein.prueba.uy: is dead
heartbeat[8445]: 2008/06/10_11:40:06 info: Comm_now_up(): updating status to
active
heartbeat[8445]: 2008/06/10_11:40:06 info: Local status now set to: 'active'
heartbeat[8445]: 2008/06/10_11:40:06 info: Starting child client
"/usr/lib/heartbeat/ipfail" (498,496)
heartbeat[8445]: 2008/06/10_11:40:06 WARN: No STONITH device configured.
heartbeat[8445]: 2008/06/10_11:40:06 WARN: Shared disks are not protected.
heartbeat[8445]: 2008/06/10_11:40:06 info: Resources being acquired from
einstein.prueba.uy.
heartbeat[8477]: 2008/06/10_11:40:06 info: Starting
"/usr/lib/heartbeat/ipfail" as uid 498 gid 496 (pid 8477)
harc[8478]: 2008/06/10_11:40:06 info: Running /etc/ha.d/rc.d/status
status
mach_down[8507]: 2008/06/10_11:40:06 info:
/usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
mach_down[8507]: 2008/06/10_11:40:06 info: mach_down takeover
complete for node einstein.prueba.uy.
heartbeat[8445]: 2008/06/10_11:40:06 info: mach_down takeover complete.
heartbeat[8445]: 2008/06/10_11:40:06 info: Initial resource acquisition
complete (mach_down)
IPaddr[8554]: 2008/06/10_11:40:06 INFO: Resource is stopped
heartbeat[8479]: 2008/06/10_11:40:06 info: Local Resource acquisition
completed.
harc[8603]: 2008/06/10_11:40:06 info: Running
/etc/ha.d/rc.d/ip-request-resp ip-request-resp
ip-request-resp[8603]: 2008/06/10_11:40:06 received ip-request-resp
100.0.4.100 OK yes
ResourceManager[8624]: 2008/06/10_11:40:06 info: Acquiring resource group:
maximatt.prueba.uy 100.0.4.100 httpd
IPaddr[8651]: 2008/06/10_11:40:07 INFO: Resource is stopped
ResourceManager[8624]: 2008/06/10_11:40:07 info: Running
/etc/ha.d/resource.d/IPaddr 100.0.4.100 start
IPaddr[8727]: 2008/06/10_11:40:07 INFO: Using calculated nic for
100.0.4.100: eth1
IPaddr[8727]: 2008/06/10_11:40:07 INFO: Using calculated netmask for
100.0.4.100: 255.0.0.0
IPaddr[8727]: 2008/06/10_11:40:07 INFO: eval ifconfig eth1:0
100.0.4.100 netmask 255.0.0.0 broadcast 100.255.255.255
IPaddr[8710]: 2008/06/10_11:40:07 INFO: Success
ResourceManager[8624]: 2008/06/10_11:40:07 info: Running /etc/init.d/httpd
start
heartbeat[8445]: 2008/06/10_11:40:17 info: Local Resource acquisition
completed. (none)
heartbeat[8445]: 2008/06/10_11:40:17 info: local resource transition
completed.
2) I start the backup node... and within a few seconds the backup node
starts serving the services too :(
The consequence is that I have two nodes offering the services, which is
not what I configured.
This is the master log (a continuation of the log pasted above):
heartbeat[8445]: 2008/06/10_11:42:17 info: Link einstein.prueba.uy:dev20603
up.
heartbeat[8445]: 2008/06/10_11:42:17 info: Status update for node
einstein.prueba.uy: status init
heartbeat[8445]: 2008/06/10_11:42:17 info: Status update for node
einstein.prueba.uy: status up
ipfail[8477]: 2008/06/10_11:42:17 info: Link Status update: Link
einstein.prueba.uy/dev20603 now has status up
ipfail[8477]: 2008/06/10_11:42:17 info: Status update: Node
einstein.prueba.uy now has status init
ipfail[8477]: 2008/06/10_11:42:17 info: Status update: Node
einstein.prueba.uy now has status up
harc[8895]: 2008/06/10_11:42:17 info: Running /etc/ha.d/rc.d/status
status
harc[8912]: 2008/06/10_11:42:17 info: Running /etc/ha.d/rc.d/status
status
heartbeat[8445]: 2008/06/10_11:42:18 info: all clients are now paused
heartbeat[8445]: 2008/06/10_11:44:16 WARN: 1 lost packet(s) for [
einstein.prueba.uy] [124:126]
heartbeat[8445]: 2008/06/10_11:44:16 info: Status update for node
einstein.prueba.uy: status active
heartbeat[8445]: 2008/06/10_11:44:16 info: No pkts missing from
einstein.prueba.uy!
heartbeat[8445]: 2008/06/10_11:44:16 info: remote resource transition
completed.
heartbeat[8445]: 2008/06/10_11:44:16 ERROR: Both machines own our resources!
heartbeat[8445]: 2008/06/10_11:44:16 ERROR: Both machines own foreign
resources!
heartbeat[8445]: 2008/06/10_11:44:16 info: maximatt.prueba.uy wants to go
standby [foreign]
heartbeat[8445]: 2008/06/10_11:44:16 ERROR: Both machines own our resources!
heartbeat[8445]: 2008/06/10_11:44:16 ERROR: Both machines own foreign
resources!
ipfail[8477]: 2008/06/10_11:44:16 info: Status update: Node
einstein.prueba.uy now has status active
harc[8933]: 2008/06/10_11:44:17 info: Running /etc/ha.d/rc.d/status
status
heartbeat[8445]: 2008/06/10_11:44:18 ERROR: Both machines own our resources!
heartbeat[8445]: 2008/06/10_11:44:18 ERROR: Both machines own foreign
resources!
heartbeat[8445]: 2008/06/10_11:44:19 WARN: Message hist queue is filling up
(376 messages in queue)
heartbeat[8445]: 2008/06/10_11:44:20 WARN: Message hist queue is filling up
(377 messages in queue)
heartbeat[8445]: 2008/06/10_11:44:21 WARN: Message hist queue is filling up
(378 messages in queue)
heartbeat[8445]: 2008/06/10_11:44:22 WARN: Message hist queue is filling up
(379 messages in queue)
heartbeat[8445]: 2008/06/10_11:44:23 WARN: Message hist queue is filling up
(380 messages in queue)
heartbeat[8445]: 2008/06/10_11:44:24 WARN: Message hist queue is filling up
(381 messages in queue)
heartbeat[8445]: 2008/06/10_11:44:25 WARN: Message hist queue is filling up
(382 messages in queue)
heartbeat[8445]: 2008/06/10_11:44:26 WARN: Message hist queue is filling up
(383 messages in queue)
heartbeat[8445]: 2008/06/10_11:44:26 WARN: No reply to standby request.
Standby request cancelled.
heartbeat[8445]: 2008/06/10_11:44:27 ERROR: Both machines own our resources!
heartbeat[8445]: 2008/06/10_11:44:27 ERROR: Both machines own foreign
resources!
This is the log for the backup node:
heartbeat[6105]: 2008/06/10_11:53:58 info: Version 2 support: false
heartbeat[6105]: 2008/06/10_11:53:58 WARN: Logging daemon is disabled
--enabling logging daemon is recommended
heartbeat[6105]: 2008/06/10_11:53:58 info: **************************
heartbeat[6105]: 2008/06/10_11:53:58 info: Configuration validated. Starting
heartbeat 2.1.3
heartbeat[6106]: 2008/06/10_11:53:58 info: heartbeat: version 2.1.3
heartbeat[6106]: 2008/06/10_11:53:58 info: Heartbeat generation: 1207843592
heartbeat[6106]: 2008/06/10_11:53:58 info: glib: UDP Broadcast heartbeat
started on port 694 (694) interface eth0
heartbeat[6106]: 2008/06/10_11:53:58 info: glib: UDP Broadcast heartbeat
closed on port 694 interface eth0 - Status: 1
heartbeat[6106]: 2008/06/10_11:53:58 info: G_main_add_TriggerHandler: Added
signal manual handler
heartbeat[6106]: 2008/06/10_11:53:58 info: G_main_add_TriggerHandler: Added
signal manual handler
heartbeat[6106]: 2008/06/10_11:53:58 info: G_main_add_SignalHandler: Added
signal handler for signal 17
heartbeat[6106]: 2008/06/10_11:53:58 info: Local status now set to: 'up'
*heartbeat[6106]: 2008/06/10_11:55:59 WARN: node maximatt.prueba.uy: is dead*
heartbeat[6106]: 2008/06/10_11:55:59 info: Comm_now_up(): updating status to
active
heartbeat[6106]: 2008/06/10_11:55:59 info: Local status now set to: 'active'
heartbeat[6106]: 2008/06/10_11:55:59 info: Starting child client
"/usr/lib/heartbeat/ipfail" (498,496)
heartbeat[6106]: 2008/06/10_11:55:59 WARN: No STONITH device configured.
heartbeat[6106]: 2008/06/10_11:55:59 WARN: Shared disks are not protected.
heartbeat[6106]: 2008/06/10_11:55:59 info: Resources being acquired from
maximatt.prueba.uy.
heartbeat[6116]: 2008/06/10_11:55:59 info: Starting
"/usr/lib/heartbeat/ipfail" as uid 498 gid 496 (pid 6116)
heartbeat[6118]: 2008/06/10_11:55:59 info: No local resources
[/usr/share/heartbeat/ResourceManager listkeys einstein.prueba.uy] to
acquire.
harc[6117]: 2008/06/10_11:55:59 info: Running /etc/ha.d/rc.d/status
status
mach_down[6146]: 2008/06/10_11:55:59 info: Taking over resource group
100.0.4.100
ResourceManager[6172]: 2008/06/10_11:55:59 info: Acquiring resource group:
maximatt.prueba.uy 100.0.4.100 httpd
IPaddr[6199]: 2008/06/10_11:55:59 INFO: Resource is stopped
ResourceManager[6172]: 2008/06/10_11:55:59 info: Running
/etc/ha.d/resource.d/IPaddr 100.0.4.100 start
IPaddr[6275]: 2008/06/10_11:55:59 INFO: Using calculated nic for
100.0.4.100: eth1
IPaddr[6275]: 2008/06/10_11:55:59 INFO: Using calculated netmask for
100.0.4.100: 255.0.0.0
IPaddr[6275]: 2008/06/10_11:55:59 INFO: eval ifconfig eth1:0
100.0.4.100 netmask 255.0.0.0 broadcast 100.255.255.255
IPaddr[6258]: 2008/06/10_11:55:59 INFO: Success
ResourceManager[6172]: 2008/06/10_11:55:59 info: Running /etc/init.d/httpd
start
mach_down[6146]: 2008/06/10_11:56:01 info:
/usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
mach_down[6146]: 2008/06/10_11:56:01 info: mach_down takeover
complete for node maximatt.prueba.uy.
heartbeat[6106]: 2008/06/10_11:56:01 info: mach_down takeover complete.
heartbeat[6106]: 2008/06/10_11:56:01 info: Initial resource acquisition
complete (mach_down)
heartbeat[6106]: 2008/06/10_11:56:09 info: Local Resource acquisition
completed. (none)
heartbeat[6106]: 2008/06/10_11:56:09 info: local resource transition
completed.
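To put numbers on the two logs above, this small Python sketch (an
illustration only; the helper name and the dictionary are made up) measures
how long each node waited between setting its local status to 'up' and
declaring its peer dead. The timestamps are copied from the log lines above.

```python
from datetime import datetime

STAMP = "%Y/%m/%d_%H:%M:%S"  # timestamp format used in the heartbeat logs


def seconds_between(first, second):
    """Whole seconds elapsed between two heartbeat log timestamps."""
    delta = datetime.strptime(second, STAMP) - datetime.strptime(first, STAMP)
    return int(delta.total_seconds())


# (local status 'up', peer declared dead), taken from each node's own log.
events = {
    "maximatt": ("2008/06/10_11:38:06", "2008/06/10_11:40:06"),
    "einstein": ("2008/06/10_11:53:58", "2008/06/10_11:55:59"),
}

waits = {node: seconds_between(up, dead) for node, (up, dead) in events.items()}

for node, wait in waits.items():
    print(f"{node}: peer declared dead {wait} s after local startup")
```

Both waits come out at roughly 120 seconds, which lines up with the
initdead 120 setting in the ha.cf files below. For einstein that would be
consistent with no heartbeat packets from maximatt arriving during its
startup window, although the timestamps alone do not prove it.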
These are my configuration files:
ha.cf (master node):
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 1
deadtime 30
initdead 120
udpport 694
bcast dev20603
auto_failback on
node maximatt.prueba.uy
node einstein.prueba.uy
respawn hacluster /usr/lib/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster
ha.cf (backup node):
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 1
deadtime 30
initdead 120
udpport 694
bcast eth0
auto_failback on
node maximatt.prueba.uy
node einstein.prueba.uy
respawn hacluster /usr/lib/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster
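For anyone comparing the two ha.cf files above by eye, here is a minimal
Python sketch (an illustration only; the variable names are made up) that
lists the lines on which they differ. The file contents are pasted in as
strings exactly as shown above.

```python
MASTER_HA_CF = """
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 1
deadtime 30
initdead 120
udpport 694
bcast dev20603
auto_failback on
node maximatt.prueba.uy
node einstein.prueba.uy
respawn hacluster /usr/lib/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster
"""

BACKUP_HA_CF = """
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 1
deadtime 30
initdead 120
udpport 694
bcast eth0
auto_failback on
node maximatt.prueba.uy
node einstein.prueba.uy
respawn hacluster /usr/lib/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster
"""

master = MASTER_HA_CF.strip().splitlines()
backup = BACKUP_HA_CF.strip().splitlines()

# Both files have the same number of directives in the same order,
# so a line-by-line comparison is enough here.
differences = [(m, b) for m, b in zip(master, backup) if m != b]

for m, b in differences:
    print(f"master: {m!r}   backup: {b!r}")
```

The only difference is the bcast line: dev20603 on the master and eth0 on
the backup. Whether those two interfaces really are the two ends of the
crossover link is not stated in this post, so it may be worth verifying.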
haresources:
maximatt.prueba.uy 100.0.4.100 httpd
I can't understand why einstein sees maximatt as dead (while maximatt does
not see einstein as dead)... I tested the connections and they are OK :(
What could be happening?
Any suggestions? (I have been stuck on this issue for a few days.)
thanks in advance!!!
Salu2 ;)