Alain,

We are aware of a newly discovered shutdown issue but don't yet have a root cause. We haven't been able to reproduce it on our equipment, so we can't fix it yet.

If you could gather a backtrace of the corosync process during shutdown, that might help. To do that, first install the corosync-debuginfo package. Then:

gdb
attach (the pid of the corosync process)
thread apply all bt

and send the output to the list.

Thanks
-steve

On Tue, 2010-05-04 at 16:06 +0200, Alain.Moulle wrote:
> Yep, I've just updated all rpms with:
> cluster-glue-1.0.5-1.el5.x86_64.rpm
> corosynclib-1.2.1-1.el5.x86_64.rpm
> resource-agents-1.0.3-2.el5.x86_64.rpm
> cluster-glue-libs-1.0.5-1.el5.x86_64.rpm
> pacemaker-1.0.8-6.el5.x86_64.rpm
> corosync-1.2.1-1.el5.x86_64.rpm
> pacemaker-libs-1.0.8-6.el5.x86_64.rpm
>
> I'll keep you informed on this thread if the problem occurs again.
>
> Thanks a lot
> Alain
>
> Andrew Beekhof wrote:
> > Alain, clusterlabs has 1.2.1 now. Could you try updating?
> >
> > On Tue, May 4, 2010 at 2:48 PM, Jan Friesse <[email protected]> wrote:
> >
> > > Hi,
> > > 1.2.0 has some shutdown issues. Try to upgrade to 1.2.1 (1.2.2 when
> > > released), and the problem should disappear.
> > >
> > > Regards,
> > > Honza
> > >
> > > Alain.Moulle wrote:
> > >
> > > > Hi everybody,
> > > >
> > > > thanks for all your responses... for now, I have not had the stall
> > > > again since this morning. It happens rarely, but often enough to be
> > > > annoying. Note that I have no resources configured, except the two
> > > > stonith resources (for a two-node cluster) or 4 stonith resources
> > > > (for a 4-node cluster).
> > > >
> > > > My corosync release is:
> > > > corosync-1.2.0-1.el5
> > > > so on RHEL5, but I have also already encountered the problem on fc12.
> > > >
> > > > And yes, it is under Pacemaker, and my rpms are:
> > > > pacemaker-1.0.8-2.el5
> > > > cluster-glue-1.0.3-1.el5
> > > > resource-agents-1.0.1-1.el5
> > > > and I also have (although it is not useful for the HA stack):
> > > > openais-1.1.0-1.el5
> > > >
> > > > Thanks
> > > > Regards.
> > > > Alain
> > > >
> > > > Jan Friesse wrote:
> > > >
> > > > > Alain,
> > > > > what version of corosync are you using?
> > > > >
> > > > > Are you using pacemaker?
> > > > >
> > > > > If you are using corosync 1.2.1 please try to send gdb bt of threads.
> > > > >
> > > > > Regards,
> > > > > Honza
> > > > >
> > > > > Alain.Moulle wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > When stopping corosync with "/etc/init.d/corosync stop", I am
> > > > > > stalled from time to time while services unload:
> > > > > > Signaling Corosync Cluster Engine (corosync) to terminate: [ OK ]
> > > > > > Waiting for corosync services to unload:.................................
> > > > > >
> > > > > > What could be the reasons?
> > > > > > What could I do to avoid this?
> > > > > > What could I do to force the unload without rebooting the node?
> > > > > >
> > > > > > Thanks for help.
> > > > > > Alain Moullé
> > > > > > _______________________________________________
> > > > > > Openais mailing list
> > > > > > [email protected]
> > > > > > https://lists.linux-foundation.org/mailman/listinfo/openais
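
The gdb steps steve gives in this thread can also be run non-interactively. The following is only a minimal sketch and not something posted to the list: the function name collect_bt and the GDB override variable are hypothetical, and it assumes gdb is installed, corosync-debuginfo is present so the frames have symbols, and the caller has permission to attach (typically root).

```shell
#!/bin/sh
# Sketch: dump backtraces of all threads of a running process with gdb.
# collect_bt and the GDB variable are illustrative names, not part of corosync.
# Setting GDB=echo gives a dry run that only prints the gdb arguments.
collect_bt() {
    # -p PID   attach to the given process
    # -batch   run the -ex commands, then detach and exit
    # -ex CMD  execute one gdb command ("thread apply all bt")
    "${GDB:-gdb}" -p "$1" -batch -ex "thread apply all bt"
}
```

For example, running `collect_bt "$(pidof corosync)" > corosync-bt.txt 2>&1` while the init script is stuck at "Waiting for corosync services to unload" would write the backtraces to a file that can be sent to the list. This assumes pidof is available and returns a single pid.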
