Fabio M. Di Nitto wrote:
> On Fri, 2009-02-27 at 09:52 -0600, David Teigland wrote:
>> On Fri, Feb 27, 2009 at 12:54:20PM +0000, Chrissie Caulfield wrote:
>>>>>> Given the time at which fence_node -U will fire, you probably want to
>>>>>> add a cman_init + cman_is_active + cman_finish loop in fence_node to
>>>>>> make sure cman is ready to reply to our ccs queries, otherwise we might
>>>>>> have a race condition at boot time (it might be already there.. didn't
>>>>>> really check the code). All our daemons do that to give cman time to
>>>>>> bootstrap.
>>>>> Yes, good point. I wonder if we'd be better off having cman_tool join
>>>>> effectively do an is_active wait before exiting? Then we could probably
>>>>> avoid doing it many other places. (It's also annoying when corosync
>>>>> crashes after is_active completes, but before I've read what I need from
>>>>> cman/ccs.)
>>> Err, cman_tool already does this with the -w switch, and the init script
>>> uses it.
>> Great, so the constant flogging to add cman_is_active checks everywhere will
>> end!? Can I remove all my cman_is_active loops?
>
> This works fine via init script. We could theoretically kill all those
> loops but at least for us developers, that start stuff by hand, they
> could still be useful.. and maybe a good failsafe if we ask users to run
> something manually for debugging.. dunno.. just a thought. I don't have
> a strong opinion on this matter.
>

You might as well take them out, to be honest. Those loops are mostly
overspill from the RHEL4 cman, where the daemon started up but could take
20-30 seconds to start or join a cluster. With openais/corosync, once the
daemon is up you can talk to it. It might not be quorate ... but that IS
your problem :-)

Chrissie
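
For anyone who wants to keep a failsafe when starting things by hand, the kind of wait loop discussed above can be sketched roughly as below. This is only an illustration under stated assumptions: wait_until_ready, ready_probe_fn and countdown_probe are hypothetical names, not anything in the cluster tree, and the probe is a stand-in so the sketch builds without libcman. In a real daemon the probe would wrap the cman_init + cman_is_active + cman_finish sequence mentioned in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical probe type: returns nonzero once the service is ready. */
typedef int (*ready_probe_fn)(void *ctx);

/*
 * Call probe() up to max_tries times, sleeping delay_sec seconds between
 * attempts. Returns 0 as soon as the probe reports ready, -1 if every
 * attempt fails.
 */
int wait_until_ready(ready_probe_fn probe, void *ctx,
                     int max_tries, unsigned int delay_sec)
{
    assert(probe != NULL);

    for (int i = 0; i < max_tries; i++) {
        if (probe(ctx))
            return 0;
        /* No point sleeping after the final failed attempt. */
        if (i + 1 < max_tries)
            sleep(delay_sec);
    }
    return -1;
}

/*
 * Stand-in probe for illustration only: ctx points to an int counter and
 * the probe reports ready on the call that brings it down to zero.
 * A libcman-based probe (assumption, not shown) would instead open a
 * handle, ask whether cman is active, and close the handle again.
 */
int countdown_probe(void *ctx)
{
    int *calls_left = ctx;

    *calls_left -= 1;
    return *calls_left <= 0;
}
```

With a counter of 3, the stand-in probe succeeds on the third call, so a budget of five tries returns 0 and a budget of two returns -1. Per the reply above, with openais/corosync such a loop should rarely spin more than once after the daemon is up.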
