On Tue, 2010-05-11 at 07:48 +0200, Alain.Moulle wrote:
> Hi,
> FYI : me too, I have debug : on and I faced the problem on RHEL5 as well
> as on fc12.
> Alain

I believe I have found the root cause of your issues. With debug: on,
the internal buffers inside logsys overflow, which triggers a spinning
condition and prevents logging from working properly. I am working on a
solution now.

Regards
-steve

> > Hi,
> >
> > I experienced the same issue on Redhat 5.5 PPC.
> > I compiled all packages myself, since there are no ppc packages available
> > in the clusterlabs repository.
> > If Andrew will post his SRPM somewhere or maybe instructions how to compile
> > it, I would be happy to contribute.
> >
> > Vadym
> >
> > On May 10, 2010, at 5:38 PM, Steven Dake wrote:
> >
> >> > It seems pretty clear from the mailing list traffic recently there is a
> >> > critical flaw with the shutdown related in some way to Pacemaker and
> >> > Corosync that happens on a few people's opensuse systems. It seems to
> >> > only reproduce on opensuse however we don't know if it is limited to
> >> > this platform. Finally we want Corosync to work perfectly for every
> >> > Linux platform and will do everything possible to understand the
> >> > specific environmental issues that are exposing bugs in Corosync.
> >> > Unfortunately for several weeks we have been unable in our labs to
> >> > reproduce this problem which means we need your help!
> >> >
> >> > The developers will work to resolve this problem at our highest priority
> >> > and release a fix as soon as we can generate an adequate execution
> >> > trace.
> >> >
> >> > We have a backtrace around where the issue occurred which presents us
> >> > with enough data to get started.
> >> >
> >> > Our plans are as follows:
> >> > Mon-Wed: Code review of suspected areas and instrumentation patch
> >> > created
> >> > Thu: Special build created by Andrew with the instrumentation patch for
> >> > those people affected by this issue.
> >> > We will begin analysis of the instrumentation results once we have a
> >> > trace.
> >> >
> >> > I would really appreciate those people affected by this issue to run
> >> > Andrew's special build of Corosync which will have more trace info in it
> >> > when it is available.
> >> >
> >> > Regards
> >> > -steve
> >> >
> >> > On Mon, 2010-05-10 at 14:26 +0200, Alain.Moulle wrote:
> >>
> >>> >> As soon as I got it again ... because it is strange, I did not face
> >>> >> the problem
> >>> >> again since this morning ! And besides I'm sure that on Friday I was
> >>> >> in a case where
> >>> >> the stop/cleanup (of a resource failed on start) enables the corosync
> >>> >> shutdown to
> >>> >> complete , and as long as I had not cleanup the failed resource, the
> >>> >> corosync stop
> >>> >> does not returns and was stalled in "Waiting for corosync services to
> >>> >> unload:........
> >>> >>
> >>> >> I'll keep you inform if I can find the conditions for this abnormal
> >>> >> behavior.
> >>> >> Thanks
> >>> >> Regards
> >>> >> Alain
> >>> >>
> >>> >> Andrew Beekhof wrote:
> >>>
> >>>> >>> On Mon, May 10, 2010 at 8:31 AM, Alain.Moulle
> >>>> >>> <[email protected]> wrote:
> >>>> >>>
> >>>>
> >>>>> >>>> I meant "/etc/init.d/corosync stop" never returns.
> >>>>> >>>>
> >>>>>
> >>>> >>>
> >>>> >>> Ok. Can you show us the logs and "ps axf" please?
> >>>> >>>
> >>>>
> >>> >>
> >>>
> _______________________________________________
> Openais mailing list
> [email protected]
> https://lists.linux-foundation.org/mailman/listinfo/openais
