On 16:16, Wed 24 Nov 10, Ante Karamatić wrote:
> On Wed, 24 Nov 2010, at 10:52 +0000, Dave Williams wrote:
> 
> > I currently have a production clustered server down because of this and
> > the fact that Ubuntu (I'm advised) has an inconsistently compiled set
> > of HA components. Certainly both the lucid and maverick released packages
> > leave defunct processes lying around and give highly unreliable
> > operation :-(
> 
> Can you elaborate on this?
> 
This is typical of the result of installing the pacemaker/corosync/cluster-glue
stack on otherwise reasonably clean machines.

root     20586  0.0  0.2 227056  5704 ?        Ssl  Nov17   4:39 /usr/sbin/corosync
root     20593  0.0  0.0      0     0 ?        Z    Nov17   0:00  \_ [stonithd] <defunct>
108      20594  0.0  0.2  80624  4636 ?        S    Nov17   0:05  \_ /usr/lib/heartbeat/cib
root     20595  0.0  0.0      0     0 ?        Z    Nov17   0:00  \_ [lrmd] <defunct>
108      20596  0.0  0.1  81568  2780 ?        S    Nov17   0:06  \_ /usr/lib/heartbeat/attrd
108      20597  0.0  0.0      0     0 ?        Z    Nov17   0:00  \_ [pengine] <defunct>
108      20598  0.0  0.1  87796  3060 ?        S    Nov17   0:05  \_ /usr/lib/heartbeat/crmd
108      20601  0.0  0.2  81016  5696 ?        S    Nov17   0:30  \_ /usr/lib/heartbeat/cib
root     20602  0.0  0.0  36424  1340 ?        S    Nov17   0:07  \_ /usr/lib/heartbeat/lrmd
108      20603  0.0  0.1  81568  3296 ?        S    Nov17   0:00  \_ /usr/lib/heartbeat/attrd
108      20604  0.0  0.1  81916  2796 ?        S    Nov17   0:00  \_ /usr/lib/heartbeat/pengine
root     20608  0.0  0.0      0     0 ?        Z    Nov17   0:00  \_ [corosync] <defunct>
root     20609  0.0  0.0      0     0 ?        Z    Nov17   0:00  \_ [corosync] <defunct>
root     20613  0.0  0.0      0     0 ?        Z    Nov17   0:00  \_ [corosync] <defunct>

It is the same irrespective of lucid/maverick, cluster-glue with or without
upstart, and 32/64 bits. These are all on ubuntu-server, not desktop.
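
For anyone who wants to check their own nodes for the same symptom, here is a
minimal sketch (not part of any HA package) that picks the defunct entries out
of `ps -eo pid,ppid,stat,comm` output. The function names find_defunct and
defunct_children_of are made up for illustration, and the sketch assumes the
procps column layout with a single header row.

```python
def find_defunct(ps_text):
    """Return (pid, ppid, command) for each zombie in the given ps output.

    Assumes the text came from `ps -eo pid,ppid,stat,comm`: one header row,
    then four whitespace-separated columns per process.
    """
    zombies = []
    for line in ps_text.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue  # ignore blank or truncated rows
        pid, ppid, stat, comm = fields
        if stat.startswith("Z"):  # a STAT beginning with Z marks a zombie
            zombies.append((int(pid), int(ppid), comm.strip()))
    return zombies


def defunct_children_of(ps_text, parent_pid):
    """Keep only the zombies whose parent is parent_pid (e.g. corosync)."""
    return [z for z in find_defunct(ps_text) if z[1] == parent_pid]
```

Feeding it the ps output and the corosync PID (20586 in the listing above)
should return only the zombies still parented by corosync, such as the
stonithd, lrmd and pengine entries.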

> OTOH, the upstart plugin in the Ubuntu packages includes one patch that
> wasn't accepted upstream, and because of that patch the upstart plugin works.
I appreciate the Ubuntu cluster-glue package with upstart is new, but sadly it
wasn't obvious from the various "announcements" I found that there were
problems with it.

I guess I shouldn't be so optimistic. The current HA stack is quite a change
from the original heartbeat-based solution I had, so there is a lot to learn.
You know what it is like when there is pressure to get things going (in our
case a serious hardware failure which required complete server replacement) -
you end up understanding the absolute minimum required to reach your
(customer's/boss's) goals.

> 
> It's a known problem that upstream's version of cluster-glue doesn't
> work yet with upstart and, pointing at myself, we still haven't tested the
> solution Dejan proposed. I'll do that today or tomorrow.

Maybe we can work in parallel on this. As I said, I'm happy to assist where I
can. Whilst I am a seasoned software professional, I am new to glib, so I have
a steep learning curve to climb in that respect!
 
> Dejan, sorry for not responding sooner. I'm having a hard time finding some
> free time to work on this :(
Ditto :-(

> 
> 
> _______________________________________________________
> Linux-HA-Dev: [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/
