Hi

I started using heartbeat this morning and I am new to the Linux-HA
project, so please bear with me if my questions are very basic.

Here is my first question:
Are the heartbeat and cluster-glue packages enough to test a simple
HA scenario?

If the answer is yes: I compiled cluster-glue and heartbeat
successfully and tried to test a simple scenario (just bringing up an
IP address or httpd), but it did not work as I expected. I would be
grateful if somebody could give me a hint.


sg168:

authkeys:
auth 1
1 sha1 myheartbeat

ha.cf:
logfacility local0
auto_failback on
logfile /var/log/ha-log
debugfile /var/log/ha-debug
debug 1
keepalive 2
deadtime 15
warntime 10
initdead 120
udpport 694
#bcast eth3
ucast eth3 192.168.50.17
node sg168 # on each node, uname -n should
node sg169 # return one of these hostnames
pacemaker off
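
As a quick sanity check on the settings above, here is a minimal Python sketch. It is not part of heartbeat: the function names parse_ha_cf and check_ha_cf are made up for illustration, it assumes the timing values are plain integer seconds as in this file, and the timing rules it applies (keepalive < warntime < deadtime, initdead at least twice deadtime) are the commonly cited recommendations rather than anything heartbeat enforces.

```python
def parse_ha_cf(text):
    """Parse ha.cf-style text into (directive, value) pairs, ignoring comments."""
    pairs = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop whole-line and trailing comments
        if not line:
            continue
        parts = line.split(None, 1)
        pairs.append((parts[0], parts[1].strip() if len(parts) > 1 else ""))
    return pairs


def check_ha_cf(pairs, local_name):
    """Return a list of human-readable problems found in the parsed directives."""
    problems = []
    nodes = [value for name, value in pairs if name == "node"]
    if local_name not in nodes:
        problems.append(f"local hostname {local_name!r} is not in the node list {nodes}")
    # Assumption: timing values are plain integer seconds (no "ms" suffixes).
    t = {name: int(value) for name, value in pairs
         if name in ("keepalive", "warntime", "deadtime", "initdead")}
    if "keepalive" in t and "warntime" in t and not t["keepalive"] < t["warntime"]:
        problems.append("keepalive should be smaller than warntime")
    if "warntime" in t and "deadtime" in t and not t["warntime"] < t["deadtime"]:
        problems.append("warntime should be smaller than deadtime")
    if "initdead" in t and "deadtime" in t and t["initdead"] < 2 * t["deadtime"]:
        problems.append("initdead is usually at least twice deadtime")
    return problems
```

On a node, the second argument would be the output of uname -n (os.uname().nodename in Python), and the text would be read from wherever ha.cf was installed.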

haresources:
sg168 IPaddr::192.168.20.222/24/eth0
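
To make the structure of that line explicit, here is a small Python sketch of how I read it: the first field is the preferred node, and each following field is a resource script name with "::"-separated arguments. The helper names are hypothetical, and the ip/prefix/interface split is my understanding of the IPaddr argument, not code taken from heartbeat.

```python
import ipaddress


def parse_haresources_line(line):
    """Split 'node resource[::arg...] ...' into the node and a list of (script, args)."""
    fields = line.split()
    node, resources = fields[0], fields[1:]
    parsed = []
    for res in resources:
        script, *args = res.split("::")
        parsed.append((script, args))
    return node, parsed


def parse_ipaddr_arg(arg):
    """Split an IPaddr argument 'ip/prefix/interface' and validate the address part."""
    parts = arg.split("/")
    ip = ipaddress.ip_address(parts[0])  # raises ValueError on a malformed address
    prefix = int(parts[1]) if len(parts) > 1 else None
    iface = parts[2] if len(parts) > 2 else None
    return ip, prefix, iface


node, resources = parse_haresources_line("sg168 IPaddr::192.168.20.222/24/eth0")
script, args = resources[0]
ip, prefix, iface = parse_ipaddr_arg(args[0])
```

For the line above this yields node sg168, script IPaddr, address 192.168.20.222, prefix length 24 and interface eth0.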

debug:
Dec 30 05:58:37 sg168 heartbeat: [7095]: info: Configuration
validated. Starting heartbeat 3.0.5
Dec 30 05:58:37 sg168 heartbeat: [7095]: debug: HA configuration OK.
Heartbeat starting.
Dec 30 05:58:37 sg168 heartbeat: [7095]: info: Heartbeat Hg Version:
node: 7e3a82377fa8c88b4d9ee47e29020d4531f4629a
Dec 30 05:58:37 sg168 heartbeat: [7096]: info: heartbeat: version 3.0.5
Dec 30 05:58:38 sg168 heartbeat: [7096]: info: Heartbeat generation: 1356801148
Dec 30 05:58:38 sg168 heartbeat: [7096]: debug: uuid
is:c885d691-407e-4fb1-8214-d28deb39470d
Dec 30 05:58:38 sg168 heartbeat: [7096]: debug: FIFO process pid: 7099
Dec 30 05:58:38 sg168 heartbeat: [7096]: debug: opening ucast eth3
(UDP/IP unicast)
Dec 30 05:58:38 sg168 heartbeat: [7096]: info: glib: ucast: write
socket priority set to IPTOS_LOWDELAY on eth3
Dec 30 05:58:38 sg168 heartbeat: [7096]: info: glib: ucast: bound send
socket to device: eth3
Dec 30 05:58:38 sg168 heartbeat: [7096]: info: glib: ucast: bound
receive socket to device: eth3
Dec 30 05:58:38 sg168 heartbeat: [7096]: info: glib: ucast: started on
port 694 interface eth3 to 192.168.50.17
Dec 30 05:58:38 sg168 heartbeat: [7096]: debug: write process pid: 7100
Dec 30 05:58:38 sg168 heartbeat: [7096]: debug: read child process pid: 7101
Dec 30 05:58:38 sg168 heartbeat: [7096]: debug: make_io_childpair:
CREATED childpair wchan socket 11
Dec 30 05:58:38 sg168 heartbeat: [7096]: debug: make_io_childpair:
CREATED childpair rchan socket 13
Dec 30 05:58:38 sg168 heartbeat: [7096]: debug: Limiting CPU: 42 CPU
seconds every 60000 milliseconds
Dec 30 05:58:38 sg168 heartbeat: [7099]: debug: pid 7099 locked in memory.
Dec 30 05:58:38 sg168 heartbeat: [7099]: debug: Limiting CPU: 6 CPU
seconds every 59999 milliseconds
Dec 30 05:58:38 sg168 heartbeat: [7101]: debug: pid 7101 locked in memory.
Dec 30 05:58:38 sg168 heartbeat: [7101]: debug: Limiting CPU: 6 CPU
seconds every 59999 milliseconds
Dec 30 05:58:38 sg168 heartbeat: [7100]: debug: pid 7100 locked in memory.
Dec 30 05:58:38 sg168 heartbeat: [7100]: debug: Limiting CPU: 24 CPU
seconds every 59999 milliseconds
Dec 30 05:58:38 sg168 heartbeat: [7096]: debug: pid 7096 locked in memory.
Dec 30 05:58:38 sg168 heartbeat: [7096]: debug: Waiting for child
processes to start
Dec 30 05:58:38 sg168 heartbeat: [7096]: info: Local status now set to: 'up'
Dec 30 05:58:38 sg168 heartbeat: [7096]: debug: All your child process
are belong to us
Dec 30 05:58:38 sg168 heartbeat: [7096]: debug: Starting local status
message @ 2000 ms intervals
Dec 30 05:58:38 sg168 heartbeat: [7096]: debug: Forking temp process
write_hostcachedata
Dec 30 05:58:38 sg168 heartbeat: [7096]: info: Managed
write_hostcachedata process 7102 exited with return code 0.
Dec 30 05:58:48 sg168 heartbeat: [7096]: info: Link sg169:eth3 up.
Dec 30 05:58:48 sg168 heartbeat: [7096]: debug: CreateInitialFilter: ip-request
Dec 30 05:58:48 sg168 heartbeat: [7096]: debug: CreateInitialFilter:
ask_resources
Dec 30 05:58:48 sg168 heartbeat: [7096]: debug: CreateInitialFilter: status
Dec 30 05:58:48 sg168 heartbeat: [7096]: debug: CreateInitialFilter:
ip-request-resp
Dec 30 05:58:48 sg168 heartbeat: [7096]: debug: CreateInitialFilter: hb_takeover
Dec 30 05:58:48 sg168 heartbeat: [7096]: debug: sending reqnodes msg
to node sg169
Dec 30 05:58:48 sg168 heartbeat: [7096]: info: Status update for node
sg169: status up
Dec 30 05:58:48 sg168 heartbeat: [7096]: debug: Status seqno: 2
msgtime: 1356834534
Dec 30 05:58:48 sg168 heartbeat: [7096]: debug:
StartNextRemoteRscReq() - calling hook
Dec 30 05:58:48 sg168 heartbeat: [7096]: debug: notify_world: invoking
harc: OLD status: up
Dec 30 05:58:48 sg168 heartbeat: [7096]: debug: Process [status]
started pid 7103
Dec 30 05:58:48 sg168 heartbeat: [7096]: debug: Starting notify process [status]
Dec 30 05:58:48 sg168 heartbeat: [7096]: debug: Forking temp process
write_hostcachedata
Dec 30 05:58:48 sg168 heartbeat: [7103]: debug: notify_world: setting
SIGCHLD Handler to SIG_DFL
Dec 30 05:58:48 sg168 heartbeat: [7103]: debug: notify_world: Running
harc status
Dec 30 05:58:48 sg168 heartbeat: [7096]: info: Managed
write_hostcachedata process 7104 exited with return code 0.
Dec 30 05:58:48 sg168 heartbeat: [7096]: WARN: Managed status process
7103 exited with return code 1.
Dec 30 05:58:48 sg168 heartbeat: [7096]: debug: RscMgmtProc 'status'
exited code 1
Dec 30 05:58:49 sg168 heartbeat: [7096]: debug: Get a repnodes msg from sg169
Dec 30 05:58:49 sg168 heartbeat: [7096]: debug: nodelist received:sg168 sg169
Dec 30 05:58:49 sg168 heartbeat: [7096]: info: Comm_now_up(): updating
status to active
Dec 30 05:58:49 sg168 heartbeat: [7096]: info: Local status now set to: 'active'
Dec 30 05:58:49 sg168 heartbeat: [7096]: debug: Sending local starting
msg: resourcestate = 0
Dec 30 05:58:49 sg168 heartbeat: [7096]: debug: hb_rsc_isstable:
ResourceMgmt_child_count: 0, other_is_stable: 0, takeover_in_progress:
0, going_standby: 0, standby running(ms): 0, resourcestate: 0
Dec 30 05:58:49 sg168 heartbeat: [7096]: debug: Get a reqnodes message
from sg169
Dec 30 05:58:49 sg168 heartbeat: [7096]: debug: get_delnodelist: delnodelist=
Dec 30 05:58:49 sg168 heartbeat: [7096]: debug: Forking temp process
write_hostcachedata
Dec 30 05:58:49 sg168 heartbeat: [7096]: debug: Forking temp process
write_delcachedata
Dec 30 05:58:49 sg168 heartbeat: [7096]: info: Managed
write_hostcachedata process 7110 exited with return code 0.
Dec 30 05:58:49 sg168 heartbeat: [7096]: info: Status update for node
sg169: status active
Dec 30 05:58:49 sg168 heartbeat: [7096]: debug: Status seqno: 6
msgtime: 1356834536
Dec 30 05:58:49 sg168 heartbeat: [7096]: debug:
StartNextRemoteRscReq() - calling hook
Dec 30 05:58:49 sg168 heartbeat: [7096]: debug: notify_world: invoking
harc: OLD status: active
Dec 30 05:58:49 sg168 heartbeat: [7096]: debug: Process [status]
started pid 7112
Dec 30 05:58:49 sg168 heartbeat: [7096]: debug: Starting notify process [status]
Dec 30 05:58:49 sg168 heartbeat: [7096]: info: AnnounceTakeover(local
0, foreign 1, reason 'HB_R_BOTHSTARTING' (0))
Dec 30 05:58:49 sg168 heartbeat: [7096]: debug: process_resources:
other now unstable
Dec 30 05:58:49 sg168 heartbeat: [7096]: debug: Sending hold resources
msg: none, stable=0 # <none>
Dec 30 05:58:49 sg168 heartbeat: [7096]: info: STATE 1 => 3
Dec 30 05:58:49 sg168 heartbeat: [7096]: debug: hb_rsc_isstable:
ResourceMgmt_child_count: 1, other_is_stable: 0, takeover_in_progress:
0, going_standby: 0, standby running(ms): 0, resourcestate: 3
Dec 30 05:58:49 sg168 heartbeat: [7096]: info: STATE 3 => 2
Dec 30 05:58:49 sg168 heartbeat: [7096]: debug: hb_rsc_isstable:
ResourceMgmt_child_count: 1, other_is_stable: 0, takeover_in_progress:
0, going_standby: 0, standby running(ms): 0, resourcestate: 2
Dec 30 05:58:49 sg168 heartbeat: [7096]: info: Managed
write_delcachedata process 7111 exited with return code 0.
Dec 30 05:58:49 sg168 heartbeat: [7112]: debug: notify_world: setting
SIGCHLD Handler to SIG_DFL
Dec 30 05:58:49 sg168 heartbeat: [7112]: debug: notify_world: Running
harc status
Dec 30 05:58:49 sg168 heartbeat: [7096]: WARN: Managed status process
7112 exited with return code 1.
Dec 30 05:58:49 sg168 heartbeat: [7096]: debug: RscMgmtProc 'status'
exited code 1
Dec 30 05:58:59 sg168 heartbeat: [7096]: info: remote resource
transition completed.
Dec 30 05:58:59 sg168 heartbeat: [7096]: debug: Sending hold resources
msg: none, stable=0 # <none>
Dec 30 05:58:59 sg168 heartbeat: [7096]: info: STATE 2 => 3
Dec 30 05:58:59 sg168 heartbeat: [7096]: debug: hb_rsc_isstable:
ResourceMgmt_child_count: 0, other_is_stable: 1, takeover_in_progress:
0, going_standby: 0, standby running(ms): 0, resourcestate: 3
Dec 30 05:58:59 sg168 heartbeat: [7096]: debug: Calling PerformAutoFailback()
Dec 30 05:58:59 sg168 heartbeat: [7096]: info: other_holds_resources: 1
Dec 30 05:58:59 sg168 heartbeat: [7096]: info: remote resource
transition completed.
Dec 30 05:58:59 sg168 heartbeat: [7096]: debug: Process
[req_our_resources(ask)] started pid 7118
Dec 30 05:58:59 sg168 heartbeat: [7096]: debug: Sending hold resources
msg: local, stable=1 # <none>
Dec 30 05:58:59 sg168 heartbeat: [7096]: info: AnnounceTakeover(local
1, foreign 1, reason 'T_RESOURCES(us)' (0))
Dec 30 05:58:59 sg168 heartbeat: [7096]: info: Initial resource
acquisition complete (T_RESOURCES(us))
Dec 30 05:58:59 sg168 heartbeat: [7096]: debug: hb_rsc_isstable:
ResourceMgmt_child_count: 1, other_is_stable: 1, takeover_in_progress:
0, going_standby: 0, standby running(ms): 0, resourcestate: 3
Dec 30 05:58:59 sg168 heartbeat: [7096]: debug: Calling PerformAutoFailback()
Dec 30 05:58:59 sg168 heartbeat: [7096]: info: AnnounceTakeover(local
1, foreign 1, reason 'T_RESOURCES(them)' (1))
Dec 30 05:58:59 sg168 heartbeat: [7096]: info: STATE 3 => 4
Dec 30 05:58:59 sg168 heartbeat: [7096]: debug: hb_rsc_isstable:
ResourceMgmt_child_count: 1, other_is_stable: 1, takeover_in_progress:
0, going_standby: 0, standby running(ms): 0, resourcestate: 4
Dec 30 05:58:59 sg168 heartbeat: [7118]: debug:
req_our_resources(/usr/local/share/heartbeat/ResourceManager listkeys
sg168)
Dec 30 05:59:00 sg168 heartbeat: [7118]: ERROR:
pclose(/usr/local/share/heartbeat/ResourceManager listkeys sg168)
exited with return code 1
Dec 30 05:59:00 sg168 heartbeat: [7118]: ERROR:
[/usr/local/share/heartbeat/ResourceManager listkeys sg168] exited
with return code 1
Dec 30 05:59:00 sg168 heartbeat: [7118]: info: No local resources
[/usr/local/share/heartbeat/ResourceManager listkeys sg168] to
acquire.
Dec 30 05:59:00 sg168 heartbeat: [7118]: debug: Sending hold resources
msg: local, stable=1 # req_our_resources()
Dec 30 05:59:00 sg168 heartbeat: [7118]: info: FIFO message [type
resource] written rc=81
Dec 30 05:59:00 sg168 heartbeat: [7096]: info: AnnounceTakeover(local
1, foreign 1, reason 'T_RESOURCES(us)' (1))
Dec 30 05:59:00 sg168 heartbeat: [7096]: debug: hb_rsc_isstable:
ResourceMgmt_child_count: 1, other_is_stable: 1, takeover_in_progress:
0, going_standby: 0, standby running(ms): 0, resourcestate: 4
Dec 30 05:59:00 sg168 heartbeat: [7096]: info: other_holds_resources: 1
Dec 30 05:59:00 sg168 heartbeat: [7096]: debug: hb_rsc_isstable:
ResourceMgmt_child_count: 1, other_is_stable: 1, takeover_in_progress:
0, going_standby: 0, standby running(ms): 0, resourcestate: 4
Dec 30 05:59:00 sg168 heartbeat: [7096]: info: other_holds_resources: 1
Dec 30 05:59:00 sg168 heartbeat: [7096]: debug: hb_rsc_isstable:
ResourceMgmt_child_count: 1, other_is_stable: 1, takeover_in_progress:
0, going_standby: 0, standby running(ms): 0, resourcestate: 4
Dec 30 05:59:00 sg168 heartbeat: [7096]: info: Managed
req_our_resources(ask) process 7118 exited with return code 0.
Dec 30 05:59:00 sg168 heartbeat: [7096]: debug: RscMgmtProc
'req_our_resources(ask)' exited code 0
Dec 30 06:00:15 sg168 heartbeat: [7096]: WARN: Gmain_timeout_dispatch:
Dispatch function for send local status took too long to execute: 120
ms (> 50 ms) (GSource: 0x8a4c390)
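
Because the mail client wrapped the log lines, the WARN and ERROR entries are easy to miss. A minimal Python sketch for rejoining the wrapped lines and keeping only those entries follows. The function names are hypothetical, and the timestamp pattern assumes the syslog-style "Dec 30 05:58:37" prefix seen above.

```python
import re

# A log entry starts with a syslog-style stamp such as "Dec 30 05:58:37 ".
STAMP = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d\d:\d\d:\d\d ")


def join_wrapped(lines):
    """Rejoin continuation lines (those without a timestamp) onto the previous entry."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        if STAMP.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line.strip()
    return entries


def warnings_and_errors(entries):
    """Keep only entries whose severity field is WARN or ERROR."""
    return [e for e in entries if re.search(r"\]: (WARN|ERROR):", e)]
```

Joining with a single space is a simplification: it restores the text as long as the wrap happened at a space, which appears to be the case in the log above.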

The configuration on the other system is like the one above.
I read the documentation, but it did not help.

Thank you so much
Ali