On Jan 17, 2008, at 6:55 PM, Serge Dubrouski wrote:
On Jan 17, 2008 10:40 AM, Serge Dubrouski <[EMAIL PROTECTED]> wrote:
On Jan 17, 2008 10:14 AM, Andrew Beekhof <[EMAIL PROTECTED]> wrote:
On Jan 17, 2008, at 5:07 PM, Serge Dubrouski wrote:
I've got it starting all right. Now it complains about permissions:
Jan 17 11:00:16 fc-node1 crmd: [32197]: ERROR: socket_wait_conn_new: unlink failure(/var/run/heartbeat/crm/crmd): Permission denied
Jan 17 11:00:16 fc-node1 cib: [32196]: ERROR: Could not open config file /var/lib/heartbeat/crm/cib.xml.last for reading: Permission denied
Jan 17 11:00:16 fc-node1 crmd: [32197]: ERROR: socket_wait_conn_new: trying to create in /var/run/heartbeat/crm/crmd bind:: Address already in use
already in use?
you don't have heartbeat running too, do you?
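One way to answer the "already in use?" question is to check whether anything is still listening on the socket path from the log. The following is only a sketch: the function name `probe_unix_socket` and its return labels are made up for illustration, and it assumes the crmd socket is a stream-type Unix domain socket, which the thread does not confirm.

```python
import os
import socket
import stat


def probe_unix_socket(path):
    """Classify path as 'missing', 'not-a-socket', 'live', 'stale' or 'denied'.

    Hypothetical helper. 'stale' means the socket file exists but no
    process accepts connections on it; 'live' means something does.
    """
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"
    except PermissionError:
        return "denied"
    if not stat.S_ISSOCK(mode):
        return "not-a-socket"

    # Assumption: the daemon uses SOCK_STREAM on this path.
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(path)
    except ConnectionRefusedError:
        return "stale"
    except PermissionError:
        return "denied"
    finally:
        client.close()
    return "live"
```

Calling `probe_unix_socket("/var/run/heartbeat/crm/crmd")` as root would return "live" if another daemon (heartbeat, for instance) still owns the socket, and "stale" if only a leftover file remains that the new crmd could not unlink.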
...................
All those files belong to hacluser:hacluster. Do they need to belong to
a different user?
assuming you're using the packages from the build service (and that
hacluser is missing a 't'), that should be right.
maybe delete /var/run/heartbeat/crm/crmd and see what perms it gets
recreated with?
Looks like you built the Fedora packages for a particular UID:
Jan 17 12:27:21 fc-node1 crmd: [323]: info: crmd_init: Starting crmd
Jan 17 12:27:21 fc-node1 attrd: [324]: ERROR: Cannot get name for uid [24]: Success
Jan 17 12:27:21 fc-node1 cib: [322]: ERROR: Cannot get name for uid [24]: Success
Then:
Jan 17 12:30:32 fc-node1 cib: [773]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.xml (digest: /var/lib/heartbeat/crm/cib.xml.sig)
Jan 17 12:30:32 fc-node1 cib: [773]: ERROR: Could not open config file /var/lib/heartbeat/crm/cib.xml for reading: Permission denied
Jan 17 12:30:32 fc-node1 cib: [773]: ERROR: retrieveCib: /var/lib/heartbeat/crm/cib.xml exists but does NOT contain valid XML.
But:
# ls -l /var/lib/heartbeat/crm/cib.xml
-rw------- 1 hacluster hacluster 3158 Jan 10 16:08 /var/lib/heartbeat/crm/cib.xml
And the crmd socket doesn't get created either; it fails with the same error: permission denied.
Changing the uid for hacluster from 501 to 24 fixed the problem.
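The mismatch above (the account has uid 501 while the daemons ask for uid 24) can be checked before starting the cluster. This is a minimal sketch, not part of any package: `lookup_uid` and `compare_uids` are hypothetical helper names, and the expected uid of 24 is only what the log lines in this thread suggest the Fedora packages were built with.

```python
import pwd


def lookup_uid(user):
    """Return the numeric uid of user, or None if the account does not exist."""
    try:
        return pwd.getpwnam(user).pw_uid
    except KeyError:
        return None


def compare_uids(user, actual_uid, expected_uid):
    """Describe whether the account's uid matches the uid the daemons expect."""
    if actual_uid is None:
        return f"{user}: no such account"
    if actual_uid != expected_uid:
        return f"{user} has uid {actual_uid}, expected {expected_uid}"
    return "ok"


if __name__ == "__main__":
    # Assumption taken from the thread: the packages expect hacluster == uid 24.
    print(compare_uids("hacluster", lookup_uid("hacluster"), 24))
```

On the node described here, this would have reported that hacluster has uid 501 where 24 was expected, which matches the "Cannot get name for uid [24]" errors.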
BTW: Stopping openais service leaves lrmd up:
[EMAIL PROTECTED] log]# service openais stop
Stopping OpenAIS daemon (aisexec): [ OK ]
[EMAIL PROTECTED] log]# ps -ef | grep heart
root 3444 1 0 12:37 pts/0 00:00:00 /usr/lib/heartbeat/lrmd
root 3483 32732 0 12:39 pts/0 00:00:00 grep heart
Is it supposed to be like that?
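To spot leftovers like this without reading `ps` output by eye, something along these lines could filter it. This is a rough sketch: the function name `leftover_daemons` is invented for illustration, the default daemon list is just the names that appear in this thread, and it assumes the eight-column layout that `ps -ef` prints above.

```python
def leftover_daemons(ps_output, names=("lrmd", "crmd", "cib", "attrd", "stonithd")):
    """Return (pid, command) for each `ps -ef` line running one of names.

    Matching is done on the basename of the executable, so a
    `grep heart` line is not mistaken for a cluster daemon.
    """
    found = []
    for line in ps_output.splitlines():
        # Columns: UID PID PPID C STIME TTY TIME CMD (CMD may contain spaces)
        fields = line.split(None, 7)
        if len(fields) < 8 or not fields[1].isdigit():
            continue  # header line or something unparseable
        command = fields[7]
        executable = command.split()[0]
        if executable.rsplit("/", 1)[-1] in names:
            found.append((int(fields[1]), command))
    return found
```

It can be fed the output of `subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout` after `service openais stop`; an empty list would mean none of the listed daemons survived the stop.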
--
Serge Dubrouski.
And some more problems:
Jan 17 12:43:27 fc-node2 lrmd: [10530]: ERROR: on_msg_add_rsc: RA class [stonith] does not exist.
Jan 17 12:43:27 fc-node2 crmd: [10532]: ERROR: lrm_add_rsc(726): got a return code HA_FAIL from a reply message of addrsc with function get_ret_from_msg.
Jan 17 12:43:27 fc-node2 crmd: [10532]: ERROR: get_lrm_resource: Could not add resource child_DoFencing:0 to LRM
Jan 17 12:43:27 fc-node2 crmd: [10532]: ERROR: do_lrm_invoke: Invalid resource definition
Not so much a problem as something that's not implemented yet.
stonithd relies on the heartbeat comms layer and thus won't work with
OpenAIS.
plus the configuration is hell and there are a number of
design/implementation issues.
we're going to get together with the Red Hat guys to figure out what
we're going to do for stonith in the new stack.
there will likely be a short-term solution next month.
_______________________________________________
Pacemaker mailing list
Pacemaker@clusterlabs.org
http://list.clusterlabs.org/mailman/listinfo/pacemaker