On Tue, Nov 23, 2010 at 11:36 PM, Dave Williams
<d...@opensourcesolutions.co.uk> wrote:
> On 21:59, Mon 22 Nov 10, Dave Williams wrote:
>> backtrace from gdb shows lrmd to be in a lock_wait
>> #0  0x00007f7e5f8ba6b4 in __lll_lock_wait () from /lib/libpthread.so.0
>> #1  0x00007f7e5f8b5849 in _L_lock_953 () from /lib/libpthread.so.0
>> #2  0x00007f7e5f8b566b in pthread_mutex_lock () from /lib/libpthread.so.0
>> #3  0x00007f7e601b0806 in g_main_context_find_source_by_id () from /lib/libglib-2.0.so.0
>> #4  0x00007f7e601b08fe in g_source_remove () from /lib/libglib-2.0.so.0
>> #5  0x00007f7e61568ba1 in G_main_del_IPC_Channel (chp=0x11deed0) at GSource.c:495
>> #6  0x00000000004065a1 in on_remove_client (user_data=0x11df8e0) at lrmd.c:1526
>> #7  0x00007f7e615694ca in G_CH_destroy_int (source=0x11deed0) at GSource.c:675
>> #8  0x00007f7e601adc11 in ?? () from /lib/libglib-2.0.so.0
>> #9  0x00007f7e601ae428 in g_main_context_dispatch () from /lib/libglib-2.0.so.0
>> #10 0x00007f7e601b22a8 in ?? () from /lib/libglib-2.0.so.0
>> #11 0x00007f7e601b27b5 in g_main_loop_run () from /lib/libglib-2.0.so.0
>> #12 0x0000000000405d32 in init_start () at lrmd.c:1267
>> #13 0x0000000000404f7a in main (argc=1, argv=0x7fff91e24478) at lrmd.c:835
>>
>
> OK - what I understand, having spent an evening looking at the source
> code, is that when the lrmadmin client disconnects from lrmd's cmd socket
> (having got what it needs), lrmd is left to tidy up by deleting the client
> event source from the GMainContext GLib loop. It is in the process of
> calling g_source_remove(), which then hangs deep inside GLib on a mutex
> lock.
>
> On the surface the overall sequence makes sense, but the hang doesn't and
> clearly shouldn't happen. I am at a loss as to whether it is a GLib
> issue (unlikely, I would have thought?) or an lrmd bug.
>
> lrmd should NEVER hang! Can anyone help?
>
> Are there any other mailing lists I can try??
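
For what it's worth, frames #3 to #8 have the shape of a thread re-entering a lock it may already hold: GLib runs the source's destroy callback (G_CH_destroy_int), and that callback calls back into g_source_remove(). Whether GLib really holds its context mutex at that point depends on the GLib version and is an assumption here, not something the trace proves. Below is a toy Python model of the pattern. ToyContext, dispatch_destroy and demo are made-up names, and the timeout is there only so the demo reports the problem instead of hanging:

```python
import threading


class ToyContext:
    """Toy stand-in for an event-loop context guarded by one mutex.

    Hypothetical model only; it does not reproduce GLib's internals.
    """

    def __init__(self, reentrant=False):
        # threading.Lock cannot be re-acquired by its owner; RLock can.
        self._lock = threading.RLock() if reentrant else threading.Lock()
        self._sources = {}  # source id -> destroy callback

    def add_source(self, source_id, on_destroy):
        with self._lock:
            self._sources[source_id] = on_destroy

    def remove_source(self, source_id, timeout=0.2):
        """Return True if the lock was obtained, False if we would have hung."""
        if not self._lock.acquire(timeout=timeout):
            return False  # a plain blocking acquire would wait here forever
        try:
            self._sources.pop(source_id, None)
            return True
        finally:
            self._lock.release()

    def dispatch_destroy(self, source_id):
        """Run a source's destroy callback while still holding the lock."""
        with self._lock:
            on_destroy = self._sources.pop(source_id)
            return on_destroy(self, source_id)


def on_remove_client(context, source_id):
    # Same shape as the trace: the destroy callback asks the context
    # to remove the very source that is being destroyed.
    return context.remove_source(source_id)


def demo(reentrant):
    context = ToyContext(reentrant=reentrant)
    context.add_source(1, on_remove_client)
    return context.dispatch_destroy(1)


if __name__ == "__main__":
    print("plain lock, removal succeeded:", demo(reentrant=False))
    print("reentrant lock, removal succeeded:", demo(reentrant=True))
```

If that is what is happening, the fix would be on the caller's side (not removing the source again from inside its own destroy callback) rather than in the locking itself, but that is a guess until someone checks the GLib version in use.
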
They are discussing something similar over on linux-ha.
Are you using upstart resources in the cluster by any chance?

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker