Re: [Openais] What can I do when facing "Waiting for corosync services to unload:........."

Alain.Moulle Wed, 05 May 2010 01:41:07 -0700

Hi,
OK, I'll try to reproduce it and capture a backtrace.

In the meantime, on the same "shutdown corosync" subject:
sometimes /etc/init.d/corosync stop completes, but the status
then reports:
corosync dead but subsys locked
In this case, is it acceptable to just run:
rm -f /var/lock/subsys/corosync
after which the status returns "corosync is stopped" as usual,
or is there a cleaner way to handle it?


Thanks
Alain
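
A minimal sketch of the lock-file cleanup asked about above, assuming the only thing wrong is a lock file left behind after corosync has exited. The function name remove_stale_lock is hypothetical and is not part of corosync or its init script. Whether removing the lock is the recommended fix is not confirmed in this thread; the sketch only guards against removing it while the daemon is still alive.

```shell
#!/bin/sh
# Remove a subsys lock file only when the named daemon is no longer running.
# Usage: remove_stale_lock LOCKFILE PROCESS_NAME
# (remove_stale_lock is a hypothetical helper, not a corosync tool.)
remove_stale_lock() {
    lock="$1"
    name="$2"

    # Nothing to clean up if the lock file does not exist.
    [ -e "$lock" ] || return 0

    # pgrep -x matches the exact process name; status 0 means a match was found.
    if pgrep -x "$name" >/dev/null 2>&1; then
        echo "$name is still running; leaving $lock in place" >&2
        return 1
    fi

    rm -f "$lock"
}

# Example, using the path from this thread:
# remove_stale_lock /var/lock/subsys/corosync corosync
```

If the function returns 1, the process is still there and the situation is the shutdown stall rather than a stale lock.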

Steven Dake wrote:

Alain,

We are aware of a newly discovered shutdown issue but don't yet have a
root cause of the problem.  We haven't been able to reproduce it on our
equipment so as of yet we can't fix it.

If you could gather a backtrace of the corosync process during shutdown
that might help.

To do that, first install the corosync-debuginfo package.

Then:

gdb -p (the pid of the corosync process)

thread apply all bt

Send the output to the list.

Thanks
-steve
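
The backtrace steps above can also be run non-interactively, which may be easier while the shutdown is stalled. This is a sketch under assumptions: the function name dump_backtrace and the GDB override variable are hypothetical conveniences, and the output file path in the example is arbitrary. The gdb options used (-p, -batch, -ex) are standard gdb command-line options.

```shell
#!/bin/sh
# Dump backtraces of all threads of a running process without an interactive session.
# Usage: dump_backtrace PID [OUTPUT_FILE]
# GDB can be set in the environment to point at another gdb binary (hypothetical knob).
dump_backtrace() {
    pid="$1"
    out="$2"

    # Reject a missing or non-numeric PID.
    case "$pid" in
        ''|*[!0-9]*)
            echo "usage: dump_backtrace PID [OUTPUT_FILE]" >&2
            return 2
            ;;
    esac

    # -p attaches to the PID, -batch exits once the commands have run,
    # -ex runs a single gdb command.
    if [ -n "$out" ]; then
        "${GDB:-gdb}" -p "$pid" -batch -ex "thread apply all bt" >"$out" 2>&1
    else
        "${GDB:-gdb}" -p "$pid" -batch -ex "thread apply all bt"
    fi
}

# Example: dump_backtrace "$(pidof corosync)" /tmp/corosync-bt.txt
```

The resulting file can then be attached to a reply on the list. The debuginfo package still needs to be installed for the backtrace to show symbol names.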


On Tue, 2010-05-04 at 16:06 +0200, Alain.Moulle wrote:

Yep, I've just updated all the rpms to:
cluster-glue-1.0.5-1.el5.x86_64.rpm
corosynclib-1.2.1-1.el5.x86_64.rpm
resource-agents-1.0.3-2.el5.x86_64.rpm
cluster-glue-libs-1.0.5-1.el5.x86_64.rpm
pacemaker-1.0.8-6.el5.x86_64.rpm
corosync-1.2.1-1.el5.x86_64.rpm
pacemaker-libs-1.0.8-6.el5.x86_64.rpm

I'll keep you informed on this thread if the problem occurs again.

Thanks a lot

Alain

Andrew Beekhof wrote:

Alain, clusterlabs has 1.2.1 now.  Could you try updating?

On Tue, May 4, 2010 at 2:48 PM, Jan Friesse <[email protected]> wrote:

Hi,
1.2.0 has some shutdown issues. Try upgrading to 1.2.1 (or 1.2.2 when
it is released), and the problem should disappear.

Regards,
 Honza


Alain.Moulle wrote:

Hi everybody,

thanks for all your responses. I have not hit the stall again
since this morning; it happens rarely, but often enough to be
annoying. Note that I have no resources configured, except the
two stonith resources (for a two-node cluster) or the four
stonith resources (for a four-node cluster).

My corosync release is:
corosync-1.2.0-1.el5
so on RHEL5, but I have also already encountered the problem on fc12.

And yes, it is running with Pacemaker, and my rpms are:
pacemaker-1.0.8-2.el5
cluster-glue-1.0.3-1.el5
resource-agents-1.0.1-1.el5
and I also have (although it is not needed for the HA stack):
openais-1.1.0-1.el5

Thanks
Regards.
Alain

Jan Friesse wrote:

Alain,
what version of corosync are you using?

Are you using pacemaker?

If you are using corosync 1.2.1, please try to send a gdb backtrace of all threads.

Regards,
  Honza

Alain.Moulle wrote:

Hi,

When stopping corosync with "/etc/init.d/corosync stop", it occasionally
stalls while unloading services:
Signaling Corosync Cluster Engine (corosync) to terminate: [  OK  ]
Waiting for corosync services to unload:.................................

What could the reasons be?
What can I do to avoid this?
How can I force the unload without rebooting the node?

Thanks for help.
Alain Moullé
_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais
