Hi Matt,
I hope this will be a slightly less stupid question.
I've installed lam-mpi on 6 SMP machines running Red Hat Linux. During a test run of NWChem (http://www.emsl.pnl.gov) on 8 processors, the frontend crashed because of a kernel bug. After bringing the frontend back up and killing lamd on the 5 slaves, I'm not able to reconnect my frontend and the slaves with ganglia to monitor the cluster. It seems that, although the daemons (authd, gexecd and ganglia-rrd) start correctly, ganglia-rrd then dies, because with the 'status' option the answer is:

perl-rrd dead but subsys locked
(for the perl-rrd subsystem, obviously). And now I can't connect to the slaves with "telnet host port".
Could a firewall be causing this?
Any ideas?

m.
