On Thu, Nov 25, 2010 at 09:47:51AM +0100, Andrew Beekhof wrote:
> On Wed, Nov 24, 2010 at 2:18 PM, Dejan Muhamedagic <[email protected]> wrote:
> > Hi,
> >
> > On Wed, Nov 24, 2010 at 10:52:23AM +0000, Dave Williams wrote:
> >> On 10:35, Wed 24 Nov 10, Dejan Muhamedagic wrote:
> >> > Hi,
> >> >
> >> > On Tue, Nov 23, 2010 at 11:03:33PM +0000, Dave Williams wrote:
> >> > > Hi,
> >> > > I have a problem that looks similar to the one reported as "possible
> >> > > deadlock in lrmd" on 21st Oct.
> >> > >
> >> > > When running lrmadmin -C to list classes, the first time it comes back
> >> > > immediately with the expected list, e.g.
> >> > >
> >> > > r...@node1:/home# lrmadmin -C
> >> > > There are 5 RA classes supported:
> >> > > lsb
> >> > > ocf
> >> > > stonith
> >> > > upstart
> >> > > heartbeat
> >> > >
> >> > > On all subsequent attempts lrmadmin hangs and never comes back (you
> >> > > have to kill it with Ctrl-C). This is repeatable on all the machines
> >> > > I have tried it on.
> >> >
> >> > I'm afraid that this was to be expected.
> >> Hi Dejan - thanks for your reply.
> >>
> >> I'm not sure which you imply:
> >> a) It's known to be buggy?
> >> b) It's working as designed?
> >> I presume a).
> >
> > It is somewhat technical, but basically it's a).
> >
> >> .....
> >>
> >> > > On the surface the overall sequence makes sense but the hang doesn't and
> >> > > clearly shouldn't happen. I am at a loss as to whether it is a GLib
> >> > > issue (unlikely I would have thought?) or it's an lrmd bug.
> >> >
> >> > It's neither. It's bad usage of glib.
> >> >
> >> Is there anyone working on resolving this? I'm happy to help but don't
> >> have the time to debug further at present - not being a glib expert.
> >> I have other critical software projects to work on and just need
> >> something that works in this area!
> >>
> >> > > IMHO lrmd should NEVER hang!
> >> >
> >> > If you don't use upstart, it won't hang.
> >>
> >> Sadly I need upstart.
> >> That's one reason I got into this situation in the
> >> first place!
> >>
> >> I currently have a production clustered server down because of this and
> >> the fact that Ubuntu (I'm advised) has an inconsistently compiled set
> >> of HA components. Certainly both the lucid and maverick released packages
> >> leave defunct processes lying around and give highly unreliable
> >> operation :-(
> >
> > The most plausible explanation is in this thread:
> > http://marc.info/?l=linux-ha-dev&m=128765996706209&w=2
> >
> > The author didn't do anything yet about it, but hopefully it is
> > going to change.
> >
>
> Do I even want to know why the lrmd needs to be calling g_type_init() ?
> Probably not...

So that it can initialize some stuff _before_ using glib. When
the upstart plugin invokes g_type_ it's already too late.

Thanks,

Dejan
_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
