Hi Matt,
I hope this will be a slightly less stupid question.
I've installed lam-mpi on 6 SMP machines running Red Hat Linux. During a
test run of nwchem (http://www.emsl.pnl.gov) on 8 processors, the
frontend crashed because of a kernel bug.
After bringing the frontend back up and killing lamd on the 5 slaves,
I'm no longer able to reconnect my frontend and the slaves with
ganglia to monitor the cluster.
It seems that the daemons (authd, gexecd and ganglia-rrd) start
correctly at boot, but ganglia-rrd then dies, because with the 'status'
option the answer is:
perl-rrd dead but subsys locked
for the perl-rrd subsystem, obviously. And now I can't connect to the
slaves with "telnet host port".
Could a firewall be causing this?
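To tell a firewall apart from a daemon that simply isn't listening, the
telnet test could be scripted roughly as below. This is only a sketch:
the function name probe, the 3 second timeout and the command-line
usage are my own assumptions, not anything that comes with ganglia or
lam-mpi, and the hosts and port have to be filled in by hand.

```python
import socket
import sys


def probe(host, port, timeout=3.0):
    """Try one TCP connection and classify the result.

    Hypothetical helper, equivalent in spirit to "telnet host port".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"          # something accepted the connection
    except ConnectionRefusedError:
        return "refused"           # host answered, nothing listening (or REJECT rule)
    except socket.timeout:
        return "timeout"           # no answer at all within the timeout
    except OSError as exc:
        return "error: %s" % exc   # e.g. name resolution or routing failure


if __name__ == "__main__":
    # Assumed usage: python probe.py PORT HOST [HOST ...]
    if len(sys.argv) >= 3:
        port = int(sys.argv[1])
        for host in sys.argv[2:]:
            print(host, port, probe(host, port))
```

As far as I understand, "refused" would usually mean the slave is
reachable but the daemon is not listening on that port, while "timeout"
would be more consistent with packets being dropped (a firewall, or the
host being down). Neither result is conclusive on its own.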
Any ideas?
m.