Hi,
OK, I'll try to reproduce it and capture a backtrace.
In the meantime, on the same subject of shutting down corosync,
it also sometimes happens that "/etc/init.d/corosync stop"
completes but the status then reports:
corosync dead but subsys locked
In this case, is it acceptable to just run:
rm -f /var/lock/subsys/corosync
after which the status returns "corosync is stopped" as usual,
or is anything else needed to clean up properly?
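The cleanup described above could be made slightly more cautious, as in the sketch below. This is only an illustration, not a fix confirmed in this thread: it assumes the daemon's process name is "corosync" and that pgrep is installed, and cleanup_stale_lock is a hypothetical helper name. It removes the lock file only when no matching process is left.

```shell
#!/bin/sh
# Sketch: remove a stale subsys lock only if the daemon is really gone.
# cleanup_stale_lock is a hypothetical helper, not part of corosync.
# Defaults: the lock path quoted in this thread, process name "corosync" (assumed).
cleanup_stale_lock() {
    lockfile="${1:-/var/lock/subsys/corosync}"
    procname="${2:-corosync}"

    # pgrep -x matches the exact process name; exit status 0 means it still runs.
    # Assumes pgrep is available on the node.
    if pgrep -x "$procname" >/dev/null 2>&1; then
        echo "$procname is still running; leaving $lockfile in place"
        return 1
    fi

    if [ -e "$lockfile" ]; then
        rm -f "$lockfile" && echo "removed stale lock $lockfile"
    else
        echo "no lock file at $lockfile"
    fi
}
```

Calling cleanup_stale_lock with no arguments and then running "/etc/init.d/corosync status" should show whether the status is back to "corosync is stopped".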
Thanks
Alain
Steven Dake wrote:
Alain,
We are aware of a newly discovered shutdown issue but don't yet have a
root cause of the problem. We haven't been able to reproduce it on our
equipment so as of yet we can't fix it.
If you could gather a backtrace of the corosync process during shutdown
that might help.
To do that, first install corosync-debuginfo package.
Then:
gdb
attach (the pid of the corosync process)
thread apply all bt
send output to list
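The same steps can also be run non-interactively, which may be easier while the init script is stalled. The sketch below is an assumption-laden example rather than a procedure from the corosync developers: dump_backtrace is a hypothetical helper, the default output file name is made up, and it relies only on gdb's standard -batch, -p and -ex options.

```shell
#!/bin/sh
# Sketch: dump backtraces of all threads of a running process into a file.
# dump_backtrace is a hypothetical helper; the default file name is an assumption.
dump_backtrace() {
    pid="${1:-}"
    out="${2:-corosync-backtrace.txt}"

    # Refuse anything that is not a plain numeric PID.
    case "$pid" in
        ''|*[!0-9]*)
            echo "usage: dump_backtrace PID [OUTFILE]" >&2
            return 2
            ;;
    esac

    # -p attaches to the PID, -ex runs the given command,
    # -batch makes gdb exit (and detach) once the commands are done.
    gdb -batch -p "$pid" -ex "thread apply all bt" > "$out" 2>&1
}
```

For example, dump_backtrace "$(pidof corosync)" run as root during the stalled shutdown would write the output to corosync-backtrace.txt, which can then be sent to the list. This assumes pidof returns a single PID.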
Thanks
-steve
On Tue, 2010-05-04 at 16:06 +0200, Alain.Moulle wrote:
Yep, I've just updated all the rpms to:
cluster-glue-1.0.5-1.el5.x86_64.rpm
corosynclib-1.2.1-1.el5.x86_64.rpm
resource-agents-1.0.3-2.el5.x86_64.rpm
cluster-glue-libs-1.0.5-1.el5.x86_64.rpm
pacemaker-1.0.8-6.el5.x86_64.rpm
corosync-1.2.1-1.el5.x86_64.rpm
pacemaker-libs-1.0.8-6.el5.x86_64.rpm
I'll keep you informed on this thread if the problem occurs again.
Thanks a lot
Alain
Andrew Beekhof wrote:
Alain, clusterlabs has 1.2.1 now. Could you try updating?
On Tue, May 4, 2010 at 2:48 PM, Jan Friesse wrote:
Hi,
1.2.0 has some shutdown issues. Try upgrading to 1.2.1 (or 1.2.2 when
released), and the problem should disappear.
Regards,
Honza
Alain.Moulle wrote:
Hi everybody,
thanks for all your responses. I have not hit the stall
again since this morning; it happens rarely, but often enough
to be annoying. Note that I have no resources configured,
except the two stonith resources (for a two-node cluster)
or 4 stonith resources (for a 4-node cluster).
My corosync release is :
corosync-1.2.0-1.el5
so on RHEL5, but I have also encountered the problem on fc12.
And yes, it is under Pacemaker, and my rpms are :
pacemaker-1.0.8-2.el5
cluster-glue-1.0.3-1.el5
resource-agents-1.0.1-1.el5
and I also have (although it is not needed for the HA stack):
openais-1.1.0-1.el5
Thanks
Regards.
Alain
Jan Friesse wrote:
Alain,
what version of corosync are you using?
Are you using pacemaker?
If you are using corosync 1.2.1 please try to send gdb bt of threads.
Regards,
Honza
Alain.Moulle wrote:
Hi,
When stopping corosync with "/etc/init.d/corosync stop", it stalls from
time to time while unloading services:
Signaling Corosync Cluster Engine (corosync) to terminate: [ OK ]
Waiting for corosync services to unload:.................................
What could be the reasons ?
What could I do to avoid this ?
What could I do to force the unload without rebooting the node ?
Thanks for help.
Alain Moullé
_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais