Alain,

We are aware of a newly discovered shutdown issue but don't yet have a
root cause.  We haven't been able to reproduce it on our equipment, so
we can't fix it yet.

If you could gather a backtrace of the corosync process during shutdown
that might help.

To do that, first install the corosync-debuginfo package.

Then:

gdb
attach <pid of the corosync process>
thread apply all bt

Then send the output to the list.
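
The gdb steps above can also be run non-interactively, which may be easier while a stop is stalled. This is only a sketch under assumptions: it assumes gdb and pidof are installed, that it runs as root, and that a single corosync process exists. The function names and the OUT path are hypothetical, not part of corosync.

```shell
#!/bin/sh
# Sketch: dump backtraces of every corosync thread without an interactive
# gdb session. OUT is a hypothetical output path; override it if needed.
OUT="${OUT:-/tmp/corosync-backtrace.txt}"

# Print the gdb invocation for a given PID so it can be checked first.
gdb_bt_command() {
    printf 'gdb -p %s -batch -ex "thread apply all bt"' "$1"
}

# Attach to corosync (if running), dump all thread backtraces, then detach.
capture_backtrace() {
    # -s asks pidof for a single PID.
    pid="$(pidof -s corosync)" || { echo "corosync is not running" >&2; return 1; }
    gdb -p "$pid" -batch -ex "thread apply all bt" > "$OUT" 2>&1
    echo "backtrace written to $OUT"
}
```

Calling capture_backtrace while "Waiting for corosync services to unload" is still printing dots should leave the backtrace in $OUT, ready to send to the list.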

Thanks
-steve


On Tue, 2010-05-04 at 16:06 +0200, Alain.Moulle wrote:
> Yep, I've just updated all rpms with :
> cluster-glue-1.0.5-1.el5.x86_64.rpm
> corosynclib-1.2.1-1.el5.x86_64.rpm
> resource-agents-1.0.3-2.el5.x86_64.rpm
> cluster-glue-libs-1.0.5-1.el5.x86_64.rpm
> pacemaker-1.0.8-6.el5.x86_64.rpm
> corosync-1.2.1-1.el5.x86_64.rpm
> pacemaker-libs-1.0.8-6.el5.x86_64.rpm
> 
> I'll keep you informed on this thread if the problem occurs again.
> 
> Thanks a lot
> Alain 
> 
> 
> Andrew Beekhof wrote:
> > Alain, clusterlabs has 1.2.1 now.  Could you try updating?
> > 
> > On Tue, May 4, 2010 at 2:48 PM, Jan Friesse <[email protected]> wrote:
> >   
> > > Hi,
> > > 1.2.0 has some shutdown issues. Try upgrading to 1.2.1 (or 1.2.2 when
> > > released), and the problem should disappear.
> > > 
> > > Regards,
> > >  Honza
> > > 
> > > 
> > > Alain.Moulle wrote:
> > >     
> > > > Hi everybody,
> > > > 
> > > > thanks for all your responses. I have not had the stall again since
> > > > this morning; it happens rarely, but often enough to be annoying.
> > > > Note that I have no resources configured except the two stonith
> > > > resources (for a two-node cluster) or four stonith resources (for a
> > > > four-node cluster).
> > > > 
> > > > My corosync release is :
> > > > corosync-1.2.0-1.el5
> > > > so on RHEL5, but I have also encountered the problem on fc12.
> > > > 
> > > > And yes, it is under Pacemaker, and my rpms are :
> > > > pacemaker-1.0.8-2.el5
> > > > cluster-glue-1.0.3-1.el5
> > > > resource-agents-1.0.1-1.el5
> > > > and I also have (although it is not needed for the HA stack):
> > > > openais-1.1.0-1.el5
> > > > 
> > > > Thanks
> > > > Regards.
> > > > Alain
> > > > 
> > > > Jan Friesse wrote:
> > > >       
> > > > > Alain,
> > > > > what version of corosync are you using?
> > > > > 
> > > > > Are you using pacemaker?
> > > > > 
> > > > > If you are using corosync 1.2.1 please try to send gdb bt of threads.
> > > > > 
> > > > > Regards,
> > > > >   Honza
> > > > > 
> > > > > Alain.Moulle wrote:
> > > > > 
> > > > >         
> > > > > > Hi,
> > > > > > 
> > > > > > When stopping corosync with "/etc/init.d/corosync stop", I am
> > > > > > occasionally stalled while the services unload:
> > > > > > Signaling Corosync Cluster Engine (corosync) to terminate: [  OK  ]
> > > > > > Waiting for corosync services to unload:.................................
> > > > > > 
> > > > > > What could be the reasons?
> > > > > > What could I do to avoid this?
> > > > > > What could I do to force the unload without rebooting the node?
> > > > > > 
> > > > > > Thanks for help.
> > > > > > Alain Moullé
> > > > > > _______________________________________________
> > > > > > Openais mailing list
> > > > > > [email protected]
> > > > > > https://lists.linux-foundation.org/mailman/listinfo/openais
> > > > > > 
> > > > > >           
> > > > > 
> > > > > 
> > > > >         
> > > 
> > >     
> > 
> > 
> >   
> 

