Thanks for answering, 

--- On Wed, 3/4/09, Andrew Beekhof <[email protected]> wrote:

> 
> crm_mon takes other things into account.
> but without logs or the current cib its impossible to say
> for sure why
> this is happening.


After a reboot or a restart, the following log output appears in ha-debug:

http://pastebin.com/m7d9c71f7

Note that the only error is:

mgmtd[5612]: 2009/03/04_16:58:25 ERROR: socket_client_channel_new: open(/var/lib/heartbeat/run/heartbeat/lrm_cmd_sock, ...) failure: No such file or directory

But the file does exist. This is probably a race condition, with the FIFO being
created after mgmtd tries to open it:

ls -la /var/lib/heartbeat/run/heartbeat/lrm_cmd_sock
prwxrwxrwx   1 root     root           0 Mar  4 16:58 /var/lib/heartbeat/run/heartbeat/lrm_cmd_sock|
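
To illustrate what I mean by a race: a client that opens the FIFO once fails if
the FIFO does not exist yet, while a client that polls for a short time gets
through. The following is only a sketch in Python, not how mgmtd or the lrmd
actually work; wait_for_fifo, the timeout and the polling interval are
hypothetical names and values I made up for the example.

```python
import os
import stat
import time


def wait_for_fifo(path, timeout=10.0, interval=0.1):
    """Poll until `path` exists and is a FIFO, or `timeout` seconds pass.

    Hypothetical helper for illustration only; it is not part of heartbeat.
    Returns True if the FIFO showed up in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # S_ISFIFO matches the 'p' file type shown by ls -l.
            if stat.S_ISFIFO(os.stat(path).st_mode):
                return True
        except FileNotFoundError:
            pass  # not created yet, keep polling
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Path taken from the error message above.
    sock = "/var/lib/heartbeat/run/heartbeat/lrm_cmd_sock"
    print("FIFO present:", wait_for_fifo(sock, timeout=5.0))
```

If the real cause is a race like this, a retry on the client side would hide
the mgmtd error, but it would not by itself explain why crmd stays
disconnected.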

At this point, cibadmin and similar tools do not work: they hang, apparently
because they cannot connect to crmd, and crm_mon shows the node as offline.

After killing crmd the following log information is found:

http://pastebin.com/m29a3ec9d

crmd[5644]: 2009/03/04_17:06:29 info: do_cib_control: CIB connection established

etc

So it seems that crmd does not initialize correctly on the initial start. Maybe
the cib process has to be started before crmd?

Maybe it is related to the fact that on Solaris SPARC, pipes are used instead
of sockets for communication.

Pipes were introduced by this patch:

http://www.mail-archive.com/[email protected]/msg00307.html

Since I have Solaris 10, I tried to use STREAMS instead, but I cannot find
ucred.h anywhere on Solaris.

Any ideas? How can I change the order in which the "Starting child client"
processes are launched?

Thanks


      
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
