On Jan 17, 2008, at 6:55 PM, Serge Dubrouski wrote:

On Jan 17, 2008 10:40 AM, Serge Dubrouski <[EMAIL PROTECTED]> wrote:

On Jan 17, 2008 10:14 AM, Andrew Beekhof <[EMAIL PROTECTED]> wrote:

On Jan 17, 2008, at 5:07 PM, Serge Dubrouski wrote:

I've got it starting all right. Now it complains about permissions:

Jan 17 11:00:16 fc-node1 crmd: [32197]: ERROR: socket_wait_conn_new:
unlink failure(/var/run/heartbeat/crm/crmd): Permission denied
Jan 17 11:00:16 fc-node1 cib: [32196]: ERROR: Could not open config
file /var/lib/heartbeat/crm/cib.xml.last for reading: Permission
denied
Jan 17 11:00:16 fc-node1 crmd: [32197]: ERROR: socket_wait_conn_new: trying to create in /var/run/heartbeat/crm/crmd bind:: Address already
in use

already in use?
you don't have heartbeat running too, do you?
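
One possible reading of the two crmd lines above, offered as an assumption rather than a confirmed diagnosis: the old socket file could not be unlinked (Permission denied), so the bind() that followed found the path still present and reported "Address already in use", even with no second process listening. A minimal Python sketch of that behaviour for Unix domain sockets follows. The temporary path is a hypothetical stand-in for /var/run/heartbeat/crm/crmd, and the helper names are made up for illustration.

```python
import errno
import os
import socket
import tempfile


def try_bind(path):
    """Bind a Unix domain socket to path.

    Returns (socket, None) on success or (None, errno) on failure.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
    except OSError as exc:
        sock.close()
        return None, exc.errno
    return sock, None


def demo():
    # Hypothetical stand-in path; nothing here touches /var/run.
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "crmd")

    first, _ = try_bind(path)            # creates the socket file
    first.close()                        # closing does not remove the file

    _, stale_err = try_bind(path)        # file still exists, so bind fails

    os.unlink(path)                      # the step that failed in the log
    second, clean_err = try_bind(path)   # works once the stale file is gone
    second.close()

    os.unlink(path)
    os.rmdir(workdir)
    return stale_err, clean_err
```

Under this reading, a leftover socket file that the daemon has no permission to remove is enough to produce both messages; heartbeat does not have to be running.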



...................

All those files belong to hacluser:hacluster. Do they need to belong to
a different user?

assuming you're using the packages from the build service (and that
hacluser is missing a 't'), that should be right.

maybe delete /var/run/heartbeat/crm/crmd and see what perms it gets
recreated with?


Looks like you built the Fedora packages for a particular UID:

Jan 17 12:27:21 fc-node1 crmd: [323]: info: crmd_init: Starting crmd
Jan 17 12:27:21 fc-node1 attrd: [324]: ERROR: Cannot get name for uid
[24]: Success
Jan 17 12:27:21 fc-node1 cib: [322]: ERROR: Cannot get name for uid
[24]: Success

Then:

Jan 17 12:30:32 fc-node1 cib: [773]: info: retrieveCib: Reading
cluster configuration from: /var/lib/heartbeat/crm/cib.xml (digest:
/var/lib/heartbeat/crm/cib.xml.sig)
Jan 17 12:30:32 fc-node1 cib: [773]: ERROR: Could not open config file
/var/lib/heartbeat/crm/cib.xml for reading: Permission denied
Jan 17 12:30:32 fc-node1 cib: [773]: ERROR: retrieveCib:
/var/lib/heartbeat/crm/cib.xml exists but does NOT contain valid XML.


But:

# ls -l /var/lib/heartbeat/crm/cib.xml
-rw------- 1 hacluster hacluster 3158 Jan 10 16:08
/var/lib/heartbeat/crm/cib.xml

And /var/run/heartbeat/crm/crmd doesn't get recreated; it fails with the same error: Permission denied.

Changing the UID for hacluster from 501 to 24 fixed the problem.
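
The mismatch described above can be checked with a few lines of Python. This is only a sketch under stated assumptions: that the daemons run as uid 24 (as the "Cannot get name for uid [24]" messages suggest), that the files were owned by uid 501, and that group membership can be ignored. The function names are hypothetical and not part of heartbeat or Pacemaker.

```python
import os
import pwd
import stat


def uid_name(uid):
    """Return the passwd name for uid, or None when no entry exists."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return None


def diagnose(daemon_uid, file_uid, file_mode, lookup=uid_name):
    """List reasons a daemon running as daemon_uid may fail on a file.

    Simplification: group membership is ignored, so a file without
    group/other read bits is treated as readable by its owner (and root) only.
    """
    problems = []
    if lookup(daemon_uid) is None:
        problems.append("no passwd entry for uid %d" % daemon_uid)
    owner_only = not (file_mode & (stat.S_IRGRP | stat.S_IROTH))
    if daemon_uid != 0 and owner_only and daemon_uid != file_uid:
        problems.append(
            "file owned by uid %d is unreadable for uid %d" % (file_uid, daemon_uid)
        )
    return problems


def diagnose_path(path, daemon_uid):
    """Run diagnose() against the real owner and mode of path."""
    st = os.stat(path)
    return diagnose(daemon_uid, st.st_uid, stat.S_IMODE(st.st_mode))
```

For example, diagnose_path("/var/lib/heartbeat/crm/cib.xml", 24) would report both problems while hacluster is still uid 501, and an empty list once the file owner and the passwd entry both use uid 24.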

BTW: Stopping openais service leaves lrmd up:

[EMAIL PROTECTED] log]# service openais stop
Stopping OpenAIS daemon (aisexec):                         [  OK  ]
[EMAIL PROTECTED] log]# ps -ef | grep heart
root      3444     1  0 12:37 pts/0    00:00:00 /usr/lib/heartbeat/lrmd
root      3483 32732  0 12:39 pts/0    00:00:00 grep heart

Is it supposed to be like that?
--
Serge Dubrouski.


And some more problems:

Jan 17 12:43:27 fc-node2 lrmd: [10530]: ERROR: on_msg_add_rsc: RA
class [stonith] does not exist.
Jan 17 12:43:27 fc-node2 crmd: [10532]: ERROR: lrm_add_rsc(726): got a
return code HA_FAIL from a reply message of addrsc with function
get_ret_from_msg.
Jan 17 12:43:27 fc-node2 crmd: [10532]: ERROR: get_lrm_resource: Could
not add resource child_DoFencing:0 to LRM
Jan 17 12:43:27 fc-node2 crmd: [10532]: ERROR: do_lrm_invoke: Invalid
resource definition

Not so much a problem as something that's not implemented yet.

stonithd relies on the heartbeat comms layer and thus won't work with OpenAIS. plus, the configuration is hell and there are a number of design/implementation issues.

we're going to get together with the Red Hat guys to figure out what we're going to do for stonith in the new stack.
there will likely be a short-term solution next month.

_______________________________________________
Pacemaker mailing list
Pacemaker@clusterlabs.org
http://list.clusterlabs.org/mailman/listinfo/pacemaker
