On Thu, Oct 21, 2010 at 11:57:56AM +0200, Senko Rasic wrote:
> Hi,
> 
> I'm replying directly to you with this, feel free to forward to the
> list. For further discussions I could also join the list.

Yes, it would be good.

> On 10/21/2010 11:34 AM, Dejan Muhamedagic wrote:
> >Senko Rasic proposed a patch for the client unregister which
> >would prevent a double unref of glib sources (IPC channel).
> >However, I cannot recall any deadlocks in lrmd.
> >
> >See http://www.init.hr/dev/cluster/patches/2444.diff
> >
> >Senko, is this something you observed, or did you just think it
> >might occur?
> 
> This is what I've observed happening whenever I had g_type_init()
> invoked either in the lrmd itself, or in the plugin. g_type_init()
> initialises the glib type system, which is needed for the dbus glib
> proxies to work.
> 
> So, when g_type_init() was called earlier, the deadlock would
> eventually happen in the part of the code I patched.
> 
> It should be very easy to test; just try using lrmd with
> g_type_init() and without, and it should be apparent it gets blocked
> after the first request, no matter which plugin serves the request.

Perhaps it would help to call g_type_init() on lrmd startup, before
doing any glib2 related work (in particular the IPC). I'm not sure,
but it could be a locking problem, i.e. the deadlock you saw was
actually lrmd waiting on some mutex. lrmd is not a threaded
application and therefore never calls g_thread_init(). Apparently,
g_thread_init() is needed (or at least recommended) before
g_type_init().

There's also a recent patch which enables threading within
g_type_init():

http://osdir.com/ml/svn-commits-list/2010-01/msg03809.html

Could you give it a try and move g_type_init() to init_start(),
before any other glib calls? Perhaps together with g_thread_init().

Though I'm not sure if this is going to work at all.

Cheers,

Dejan

> >BTW, this same or very similar procedure is used by all programs
> >including pacemaker.
> 
> Right, the code I commented out didn't look obviously wrong to me (so
> I added a comment about the potential ref leak). Unfortunately I
> don't know much about lrmd internals and haven't been able to reason
> out why this would be happening.
> 
> BR,
> Senko
> 
> -- 
> Senko Rasic, DobarKod d.o.o. <[email protected]>
> http://dobarkod.hr/
_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
