Never mind. I had to add this to /etc/corosync/service.d/pcmk:
service {
        name: pacemaker
        ver: 1
}
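For anyone who wants to check whether that stanza is already present on a node, here is a rough sketch in Python. It is only an illustration: the function name pacemaker_service_ver is made up for this example and is not part of corosync or pacemaker, and it assumes one "key: value" pair per line, as in the stanza above.

```python
import re

def pacemaker_service_ver(config_text):
    """Return the 'ver' value of the pacemaker service stanza, or None if absent.

    Hypothetical helper for illustration; assumes one 'key: value' per line.
    """
    # Look at every flat "service { ... }" block in the text.
    for block in re.findall(r"service\s*\{([^{}]*)\}", config_text):
        pairs = dict(
            (key.strip(), value.strip())
            for key, value in (
                line.split(":", 1) for line in block.splitlines() if ":" in line
            )
        )
        if pairs.get("name") == "pacemaker":
            return pairs.get("ver")
    return None

# The stanza from this message, used as sample input.
sample = """service {
name: pacemaker
ver: 1
}"""
print(pacemaker_service_ver(sample))
```

To run it against a real node, read the contents of /etc/corosync/service.d/pcmk into a string and pass that in place of sample. A result of None would mean no pacemaker service stanza was found in that text.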
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Yount, William D
Sent: Wednesday, August 08, 2012 5:39 PM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] Can't start pacemaker
It seems to be an issue with the Corosync quorum API (see the
init_quorum_connection error). Here is the output from corosync.log:
Aug 08 17:18:45 [30364] KNTCLFS001 cib: info: startCib: CIB
Initialization completed successfully
Aug 08 17:18:45 [30364] KNTCLFS001 cib: info: get_cluster_type:
Cluster type is: 'corosync'
Aug 08 17:18:45 [30364] KNTCLFS001 cib: notice: crm_cluster_connect:
Connecting to cluster infrastructure: corosync
Aug 08 17:18:45 [30364] KNTCLFS001 cib: info:
init_ais_connection_once: Connection to 'corosync': established
Aug 08 17:18:45 [30364] KNTCLFS001 cib: info: crm_new_peer: Node
KNTCLFS001 now has id: 83994816
Aug 08 17:18:45 [30364] KNTCLFS001 cib: info: crm_new_peer: Node
83994816 is now known as KNTCLFS001
Aug 08 17:18:45 [30364] KNTCLFS001 cib: info: cib_init:
Starting cib mainloop
Aug 08 17:18:45 [30364] KNTCLFS001 cib: info: set_crm_log_level:
New log level: 3 0
Aug 08 17:18:46 [30369] KNTCLFS001 crmd: info: do_cib_control:
CIB connection established
Aug 08 17:18:46 [30369] KNTCLFS001 crmd: info: get_cluster_type:
Cluster type is: 'corosync'
Aug 08 17:18:46 [30369] KNTCLFS001 crmd: notice: crm_cluster_connect:
Connecting to cluster infrastructure: corosync
Aug 08 17:18:46 [30365] KNTCLFS001 stonith-ng: notice: setup_cib:
Watching for stonith topology changes
Aug 08 17:18:46 [30365] KNTCLFS001 stonith-ng: info: main: Starting
stonith-ng mainloop
Aug 08 17:18:46 [30369] KNTCLFS001 crmd: info:
init_ais_connection_once: Connection to 'corosync': established
Aug 08 17:18:46 [30369] KNTCLFS001 crmd: info: crm_new_peer: Node
KNTCLFS001 now has id: 83994816
Aug 08 17:18:46 [30369] KNTCLFS001 crmd: info: crm_new_peer: Node
83994816 is now known as KNTCLFS001
Aug 08 17:18:46 [30369] KNTCLFS001 crmd: info: ais_status_callback:
status: KNTCLFS001 is now unknown
Aug 08 17:18:46 [30369] KNTCLFS001 crmd: error:
init_quorum_connection: The Corosync quorum API is not supported in this
build
Aug 08 17:18:46 [30360] KNTCLFS001 pacemakerd: error: pcmk_child_exit:
Child process crmd exited (pid=30369, rc=100)
Aug 08 17:18:46 [30360] KNTCLFS001 pacemakerd: warning: pcmk_child_exit:
Pacemaker child process crmd no longer wishes to be respawned. Shutting
ourselves down.
Aug 08 17:18:46 [30360] KNTCLFS001 pacemakerd: warning: send_ipc_message:
IPC Channel to 30369 is not connected
Aug 08 17:18:46 [30360] KNTCLFS001 pacemakerd: notice: pcmk_shutdown_worker:
Shuting down Pacemaker
Aug 08 17:18:46 [30360] KNTCLFS001 pacemakerd: notice: stop_child:
Stopping pengine: Sent -15 to process 30368
Aug 08 17:18:46 [30360] KNTCLFS001 pacemakerd: info: pcmk_child_exit:
Child process pengine exited (pid=30368, rc=0)
Aug 08 17:18:46 [30360] KNTCLFS001 pacemakerd: notice: stop_child:
Stopping attrd: Sent -15 to process 30367
Aug 08 17:18:46 [30367] KNTCLFS001 attrd: notice: main: Exiting...
Aug 08 17:18:46 [30360] KNTCLFS001 pacemakerd: info: pcmk_child_exit:
Child process attrd exited (pid=30367, rc=0)
Aug 08 17:18:46 [30360] KNTCLFS001 pacemakerd: warning: send_ipc_message:
IPC Channel to 30367 is not connected
Aug 08 17:18:46 [30360] KNTCLFS001 pacemakerd: notice: stop_child:
Stopping lrmd: Sent -15 to process 30366
Aug 08 17:18:46 KNTCLFS001 lrmd: [30366]: info: lrmd is shutting down
Aug 08 17:18:46 [30360] KNTCLFS001 pacemakerd: info: pcmk_child_exit:
Child process lrmd exited (pid=30366, rc=0)
Aug 08 17:18:46 [30360] KNTCLFS001 pacemakerd: notice: stop_child:
Stopping stonith-ng: Sent -15 to process 30365
Aug 08 17:18:46 [30365] KNTCLFS001 stonith-ng: info: crm_signal_dispatch:
Invoking handler for signal 15: Terminated
Aug 08 17:18:46 [30365] KNTCLFS001 stonith-ng: info: stonith_shutdown:
Terminating with 0 clients
Aug 08 17:18:46 [30360] KNTCLFS001 pacemakerd: info: pcmk_child_exit:
Child process stonith-ng exited (pid=30365, rc=0)
Aug 08 17:18:46 [30360] KNTCLFS001 pacemakerd: notice: stop_child:
Stopping cib: Sent -15 to process 30364
Aug 08 17:18:46 [30364] KNTCLFS001 cib: info: crm_signal_dispatch:
Invoking handler for signal 15: Terminated
Aug 08 17:18:46 [30364] KNTCLFS001 cib: info: cib_shutdown:
Disconnected 0 clients
Aug 08 17:18:46 [30364] KNTCLFS001 cib: info:
cib_process_disconnect: All clients disconnected...
Aug 08 17:18:46 [30364] KNTCLFS001 cib: info:
cib_ha_connection_destroy: Heartbeat disconnection complete... exiting
Aug 08 17:18:46 [30364] KNTCLFS001 cib: info:
cib_ha_connection_destroy: Exiting...
Aug 08 17:18:46 [30364] KNTCLFS001 cib: info: crm_xml_cleanup:
Cleaning up memory from libxml2
Aug 08 17:18:46 [30364] KNTCLFS001 cib: info: main: Done
Aug 08 17:18:46 [30360] KNTCLFS001 pacemakerd: info: pcmk_child_exit:
Child process cib exited (pid=30364, rc=0)
Aug 08 17:18:46 [30360] KNTCLFS001 pacemakerd: notice: pcmk_shutdown_worker:
Shutdown complete
Aug 08 17:18:46 [30360] KNTCLFS001 pacemakerd: notice: pcmk_shutdown_worker:
Attempting to inhibit respawning after fatal error
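To pick the failures out of a log like the one above without reading every line, something along these lines should work. It is a sketch, not a tested tool: the regular expression is written only from the entries in this excerpt, it assumes each entry sits on a single line in the real file (the mail client wrapped them here), and the names ENTRY and problems are made up for the example. Lines in a different layout, such as the lrmd one, are simply skipped.

```python
import re

# Layout assumed from the excerpt above:
# "Aug 08 17:18:46 [30369] KNTCLFS001 crmd: error: func: message"
ENTRY = re.compile(
    r"^(?P<ts>\w{3} \d{2} \d{2}:\d{2}:\d{2}) \[(?P<pid>\d+)\] (?P<host>\S+)\s+"
    r"(?P<daemon>[\w-]+):\s+(?P<level>\w+):\s+(?P<func>\w+):\s*(?P<msg>.*)$"
)

def problems(lines, levels=("error", "warning")):
    """Return (daemon, level, function, message) for entries at the given levels."""
    found = []
    for line in lines:
        match = ENTRY.match(line.strip())
        if match and match.group("level") in levels:
            found.append(
                (match.group("daemon"), match.group("level"),
                 match.group("func"), match.group("msg"))
            )
    return found

# Three entries copied from the log above, joined back onto single lines.
sample = [
    "Aug 08 17:18:46 [30369] KNTCLFS001 crmd: info: do_cib_control: CIB connection established",
    "Aug 08 17:18:46 [30369] KNTCLFS001 crmd: error: init_quorum_connection: The Corosync quorum API is not supported in this build",
    "Aug 08 17:18:46 [30360] KNTCLFS001 pacemakerd: error: pcmk_child_exit: Child process crmd exited (pid=30369, rc=100)",
]
for daemon, level, func, msg in problems(sample):
    print(daemon, level, func, msg)
```

On the sample it keeps the two error entries and drops the info one. For a real file, pass an open file object instead of the list.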
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Yount, William D
Sent: Wednesday, August 08, 2012 2:52 AM
To: [email protected]
Subject: [Linux-HA] Can't start pacemaker
I am following the "Clusters from Scratch" guide to set up a cluster on two
CentOS 6.3 boxes. I am at the part where corosync is started and working
correctly on both nodes. When I try to start pacemaker on either node, it keeps
failing. Here is the output from strace:
stat("/etc/init.d/pacemaker", {st_mode=S_IFREG|0755, st_size=2543, ...}) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
stat(".", {st_mode=S_IFDIR|0555, st_size=4096, ...}) = 0
stat("/sbin/env", 0x7fff9f8d6bb0) = -1 ENOENT (No such file or directory)
stat("/usr/sbin/env", 0x7fff9f8d6bb0) = -1 ENOENT (No such file or directory)
stat("/bin/env", {st_mode=S_IFREG|0755, st_size=23832, ...}) = 0
stat("/bin/env", {st_mode=S_IFREG|0755, st_size=23832, ...}) = 0
geteuid() = 0
getegid() = 0
getuid() = 0
getgid() = 0
access("/bin/env", X_OK) = 0
stat("/bin/env", {st_mode=S_IFREG|0755, st_size=23832, ...}) = 0
geteuid() = 0
getegid() = 0
getuid() = 0
getgid() = 0
access("/bin/env", R_OK) = 0
stat("/bin/env", {st_mode=S_IFREG|0755, st_size=23832, ...}) = 0
stat("/bin/env", {st_mode=S_IFREG|0755, st_size=23832, ...}) = 0
geteuid() = 0
getegid() = 0
getuid() = 0
getgid() = 0
access("/bin/env", X_OK) = 0
stat("/bin/env", {st_mode=S_IFREG|0755, st_size=23832, ...}) = 0
geteuid() = 0
getegid() = 0
getuid() = 0
getgid() = 0
access("/bin/env", R_OK) = 0
rt_sigprocmask(SIG_BLOCK, [INT CHLD], [], 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [INT CHLD], 8) = 0
rt_sigprocmask(SIG_SETMASK, [INT CHLD], NULL, 8) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f48bced69d0) = 3949
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGINT, {0x43d060, [], SA_RESTORER, 0x7f48bc53a920}, {SIG_DFL, [], SA_RESTORER, 0x7f48bc53a920}, 8) = 0
wait4(-1, Starting Pacemaker Cluster Manager: [FAILED]
[{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0, NULL) = 3949
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
--- SIGCHLD (Child exited) @ 0 (0) ---
wait4(-1, 0x7fff9f8d671c, WNOHANG, NULL) = -1 ECHILD (No child processes)
rt_sigreturn(0xffffffffffffffff) = 0
rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7f48bc53a920}, {0x43d060, [], SA_RESTORER, 0x7f48bc53a920}, 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
read(255, "", 1694) = 0
exit_group(1) = ?
I see the "No such file or directory" messages, but I am not sure what impact
they have on the application. I have also noticed that corosync spikes to
100% CPU usage, which makes the entire system sluggish. Here are the software
versions:
centos-release-6-3.el6.centos.9.x86_64
corosynclib-1.4.1-7.el6.x86_64
corosync-1.4.1-7.el6.x86_64
pacemaker-cli-1.1.7-6.el6.x86_64
pacemaker-1.1.7-6.el6.x86_64
pacemaker-libs-1.1.7-6.el6.x86_64
pacemaker-cluster-libs-1.1.7-6.el6.x86_64
William
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems