Bernard Li wrote:
Can you show us the output of pbsnodes -a now that you have resolved the problem with /etc/hosts? I think your guess is right, and that the two new nodes cannot communicate with your pbs_server - did you check their error logs located in /var/spool/pbs?

Also, please post lamtest.err and mpichtest.err; they may help to figure out what's wrong.

It reported that lamd wasn't running on a compute node.
But isn't this started by PBS?

You did re-run "complete cluster setup" after your 2 new nodes were added, right?

(a few times!!)



I decided to start from scratch and rebuild the whole thing.
All works fine now.

4 compute nodes.

(4 hours total, including the head node Linux install and downloading the full RPM set for clients)

I have 20 more nodes to add next week.

Now all I've got to do is remember the project I built the cluster for!!


/nc


_______________________________________________
Oscar-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/oscar-users
