I forgot to add something useful, appended below.
-- Cos
On Fri, Jul 30, 2010 at 05:02:04PM -0400, I wrote:
> I've gotten further, into another roadblock:
> Starting the cman service gets as far as successfully running ccsd,
> but then at the next step where it checks cman status, it fails.
>
> | Starting cluster:
> | Loading modules... done
> | Mounting configfs... done
> | Starting ccsd... done
> | Starting cman... failed
> |
> | [FAILED]
>
> Any attempt to use cman_tool gives:
> cman_tool: Cannot open connection to cman, is it running ?
>
> strace shows that it fails here:
> connect(3, {sa_family=AF_FILE, path="/var/run/cman_client"}, 110) = -1 ECONNREFUSED (Connection refused)
>
> strace -fp on ccsd shows the same thing, in a loop, like this:
>
> [pid 7670] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
> [pid 7670] rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
> [pid 7670] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> [pid 7670] nanosleep({1, 0}, {1, 0}) = 0
> [pid 7670] socket(PF_FILE, SOCK_STREAM, 0) = 9
> [pid 7670] fcntl(9, F_SETFD, FD_CLOEXEC) = 0
> [pid 7670] connect(9, {sa_family=AF_FILE, path="/var/run/cman_client"}, 110) = -1 ECONNREFUSED (Connection refused)
> [pid 7670] close(9) = 0
>
> Named sockets are there and seem to have the right permissions:
> srw------- 1 root root 0 Jul 30 15:21 /var/run/cman_admin=
> srw-rw---- 1 root root 0 Jul 30 15:21 /var/run/cman_client=
>
> I've been googling for ideas. I don't really understand what's
> supposed to be using this named socket to communicate with what,
> but I think this is a way for clients to talk to the cman plugin
> in aisexec. aisexec seems okay on all my cluster nodes, and they're
> all communicating with each other, but I also don't know the right
> ways to poke at it to check its health.
>
> I'm continuing to experiment, and Google for ideas, but maybe
> someone here can point me in the right direction.
> -- Cos
I noticed that lsof on my aisexec daemons on this cluster shows that
they do *not* have /var/run/cman_client or /var/run/cman_admin open.
lsof shows nothing has those open. When I checked on a working
cluster I saw that aisexec does have both of those open.
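For anyone who wants to repeat that check without lsof, here is a rough
Python sketch. It assumes Linux and the usual /proc/net/unix column layout
(Num RefCount Protocol Flags Type St Inode Path), and it assumes that a
listening socket carries the 0x10000 flag there. The function name
bound_sockets is just something made up for this example.

```python
import os

# Assumption: /proc/net/unix columns are
#   Num RefCount Protocol Flags Type St Inode Path
# and a listening socket has the 0x10000 bit set in Flags.
SO_ACCEPTCON = 0x10000


def bound_sockets(proc_text, path):
    """Return (inode, is_listening) for every socket bound to `path`.

    `proc_text` is the full text of /proc/net/unix (or a sample of it).
    """
    found = []
    for line in proc_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Unbound sockets have no Path column, so they have only 7 fields.
        if len(fields) >= 8 and fields[7] == path:
            flags = int(fields[3], 16)
            found.append((int(fields[6]), bool(flags & SO_ACCEPTCON)))
    return found


if __name__ == "__main__" and os.path.exists("/proc/net/unix"):
    with open("/proc/net/unix") as f:
        text = f.read()
    for name in ("/var/run/cman_client", "/var/run/cman_admin"):
        print(name, bound_sockets(text, name) or "no socket bound")
```

An empty result for a path means no live socket is bound to it, even if
the file is still sitting in /var/run. To find the owning process for an
inode, look for a "socket:[inode]" link under /proc/<pid>/fd.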
I tried rm'ing both and letting them be recreated when I restarted the
aisexec and cman services, but I still get ECONNREFUSED on cman_client
when I try running any cman_tool command.
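The failing connect() can also be reproduced outside cman_tool. Below is
a small Python sketch that tries the same kind of connection as in the
strace output (a Unix stream socket) and reports how it fails. The
function name probe_unix_socket and the three result strings are made up
for this example. The socket paths are the ones from this thread.

```python
import errno
import socket


def probe_unix_socket(path, timeout=2.0):
    """Try to connect to a Unix stream socket and classify the outcome."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect(path)
        return "listening"
    except OSError as e:
        if e.errno == errno.ECONNREFUSED:
            # The socket file exists, but no process is accepting on it.
            return "refused"
        if e.errno == errno.ENOENT:
            # There is no socket file at this path at all.
            return "missing"
        return "error: " + errno.errorcode.get(e.errno, str(e))
    finally:
        s.close()


if __name__ == "__main__":
    for name in ("/var/run/cman_client", "/var/run/cman_admin"):
        print(name, probe_unix_socket(name))
```

A "refused" result with the file present is consistent with what lsof
shows: the file exists, but no process has a listening socket behind it.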
-- Cos
_______________________________________________
Openais mailing list
https://lists.linux-foundation.org/mailman/listinfo/openais