Hi Brad:
I took oscar-users off the CC: list since this is a discussion of an
unreleased/development version.
The step "Complete Cluster Setup" should set up TORQUE with information
about your compute nodes; specifically, the script "post_install" in
packages/torque/scripts should do that. Perhaps it ran into some trouble
during execution.
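To see quickly which nodes are missing from the TORQUE nodes file
(server_priv/nodes under the TORQUE home directory; the exact location
depends on the install), something like the rough sketch below may help.
It assumes the usual one-host-per-line format with an optional np=N
field, and it treats lines starting with "#" as comments, which is an
assumption. The host names and the helper names are made up for
illustration; this is not what post_install itself does.

```python
def parse_nodes_file(text):
    """Return {hostname: np} parsed from the text of a TORQUE nodes file."""
    nodes = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines; treating "#" lines as comments is an assumption.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        host = fields[0]
        np = 1  # TORQUE uses one virtual processor when np= is omitted
        for field in fields[1:]:
            if field.startswith("np="):
                np = int(field[3:])
        nodes[host] = np
    return nodes


def missing_nodes(text, expected):
    """List the expected host names that do not appear in the nodes file."""
    present = parse_nodes_file(text)
    return [host for host in expected if host not in present]


# Hypothetical example: only the head node was written to the file.
sample = "headnode np=1\n"
expected = ["headnode", "node1", "node2"]
print(missing_nodes(sample, expected))
```

In practice you would read the real file's contents into `sample` and
list your own node names in `expected`.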
Can you post your updated oscarinstall.log? That should provide more
information regarding which step failed.
Cheers,
Bernard
From: Brad Aisa [mailto:[EMAIL PROTECTED]
Sent: Sun 18/06/2006 14:30
To: oscar devel; oscar users
Cc: Bernard Li
Subject: Success!!! (w/ caveats...)
[Fedora Core 5/OSCAR 5]
Hey, so I was finally able to install my two test nodes via PXE!!!
The tests are not passing, though.
The first issue was that PBS didn't have my nodes in its nodes file;
it only had the head node (which I also assigned as a compute node).
What part of the OSCAR Wizard was supposed to do that?
Note that Ganglia lists all 3 of my nodes.
Here are the test results (after adding my nodes manually):
Performing root tests...
TORQUE node check [PASSED]
TORQUE service check:pbs_server [PASSED]
Maui service check:maui [PASSED]
/home mounts [PASSED]
Preparing user tests...
Performing user tests...
SSH ping test [PASSED]
SSH server->node [PASSED]
SSH node->server [PASSED]
Open MPI (via TORQUE) qsub: Job exceeds queue resource limits
Open MPI (via TORQUE) [FAILED]
Ganglia setup test [PASSED]
Ganglia node count test [PASSED]
TORQUE default queue definition [PASSED]
TORQUE Shell Test qsub: Job exceeds queue resource limits
TORQUE Shell Test [FAILED]
LAM/MPI (via TORQUE) qsub: Job exceeds queue resource limits
LAM/MPI (via TORQUE) [FAILED]
MPICH (via TORQUE) qsub: Job exceeds queue resource limits
MPICH (via TORQUE) [FAILED]
...Hit <ENTER> key to exit...
Brad Aisa
baisa at brad-aisa dot com
_______________________________________________ Oscar-devel mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/oscar-devel
