On Mon, 2010-03-22 at 17:50 +0100, Andreas Mock wrote:
> Hi all,
>
> I'm using corosync 1.2 together with pacemaker 1.0.8 as found at
> clusterlabs.org on openSuSE 11.2.
>
> Now I have a situation where /etc/init.d/corosync status replies with a
> running instance of corosync.
> Log shows that the child services have stopped and "detached" but ps axf
> shows the following:
>
>  7211 ?        Ssl   98:04 corosync
>  7219 ?        Zs     0:00  \_ [stonithd] <defunct>
>  7220 ?        Z      0:00  \_ [cib] <defunct>
>  7222 ?        Z      0:00  \_ [attrd] <defunct>
>  7224 ?        Z      0:00  \_ [crmd] <defunct>
>
My first guess is that you're shutting down pacemaker and running into a rare
deadlock that happens only during shutdown of corosync. You're in luck
though, because it is fixed in whitetank and pending release. You can verify
that, if you would like, by doing the following:

install the corosync-debuginfo package, then run:
gdb attach 7211
thread apply all bt

I can tell from the backtraces whether you're hitting this problem.

Regards
-steve

> Uaaah, how can this be?
> corosync is logging nothing anymore. See the last log entries
> http://paste.org/pastebin/view/16608
>
> Any hints?
>
> Best regards
> Andreas Mock
> _______________________________________________
> Openais mailing list
> [email protected]
> https://lists.linux-foundation.org/mailman/listinfo/openais
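
If it is easier to attach the backtraces to a reply as a file, the gdb steps suggested above can probably be run non-interactively. The following is only a sketch, not something from the original thread: the helper names and the output file name are hypothetical, it assumes gdb is installed and that you run it with enough privileges (typically root) to attach to corosync, and 7211 is simply the PID from the ps output quoted earlier. It relies on gdb's standard -p, -batch and -ex options.

```python
import shlex
import subprocess
from typing import List


def gdb_backtrace_command(pid: int) -> List[str]:
    """Build a gdb invocation that dumps every thread's backtrace, then exits."""
    return [
        "gdb",
        "-p", str(pid),                 # attach to the already-running process
        "-batch",                       # run the -ex commands and quit
        "-ex", "thread apply all bt",   # same command as in the manual steps
    ]


def capture_backtraces(pid: int, out_path: str) -> int:
    """Run gdb against pid, write its output to out_path, return gdb's exit code."""
    with open(out_path, "w") as out:
        result = subprocess.run(
            gdb_backtrace_command(pid),
            stdout=out,
            stderr=subprocess.STDOUT,
            check=False,
        )
    return result.returncode


if __name__ == "__main__":
    # Only prints the command; call capture_backtraces(7211, "corosync-bt.txt")
    # (hypothetical file name) to actually attach. Substitute your own PID.
    print(shlex.join(gdb_backtrace_command(7211)))
```

Without the corosync-debuginfo package installed the backtraces will still be produced, but most frames are likely to show up without usable symbol names.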
