On Wed, 2009-04-01 at 00:12 +0200, Lars Marowsky-Bree wrote:
> On 2009-03-22T09:29:22, Lars Marowsky-Bree <[email protected]> wrote:
>
> So that they don't drop off the radar: Just wanted to point out that the
> CPG crashes are still around, mostly the pi corruption manifesting
> itself.
Yes, I was out the previous week and spent last week catching up. I
believe the way to handle this is to understand the test scenario in
detail so that we can reproduce it with tests like testcpg, outside of
the cluster framework.
That allows us to debug it ourselves without relying on you building a
complete rpm to give us a trace.
If you were using corosync, we could use the logsys tracing system. If
we are unable to reproduce the problem in our own environments, I'll
backport logsys (very simple) and we can generate better log output by
fully instrumenting cpg.
We will find a solution to this problem.
Regards
-steve
> This is with r1761 whitetank.
>
> > Thread 1 (Thread 6988):
> > #0 0x00007fc07fdfc667 in do_proc_join (name=0x7fff91153bf0, pid=19546,
> > nodeid=7, reason=1) at cpg.c:726
> > 726 if (pi->pid == pid && pi->nodeid == nodeid) {
>
> > #0 0x00007fc69f1b813c in notify_lib_joinlist (gi=0x786ea0, conn=0x0,
> > joined_list_entries=1,
> > joined_list=0x7fffb04482f0, left_list_entries=0, left_list=0x0, id=4)
> > at cpg.c:386
> > 386 if (pi->pid)
>
> #0 message_handler_req_exec_cpg_mcast (message=0x7fff5bae6aa0,
> nodeid=4) at cpg.c:917
> 917 if (pi->trackerconn && (pi->flags & PI_FLAG_MEMBER)) {
>
> ...
>
> etc.
>
>
> Regards,
> Lars
>
_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais