From: Neil Costigan [mailto:[EMAIL PROTECTED]
Sent: Wed 05/04/2006 04:58
To: Bernard Li
Cc: [email protected]
Subject: Re: [Oscar-users] help!. building client image (scientific linux 305)
Bernard Li wrote:
>What's the output of 'pbsnodes -a'?
>
pbsnodes -a returns that all are unknown or down

[EMAIL PROTECTED] oscar]# pbsnodes -a
cc001.pg-207.computing.dcu.ie
     state = state-unknown,down
     np = 1
     properties = all
     ntype = cluster

cc002.pg-207.computing.dcu.ie
     state = state-unknown,down
     np = 1
     properties = all
     ntype = cluster

cc003.pg-207.computing.dcu.ie
     state = state-unknown,down
     np = 1
     properties = all
     ntype = cluster

cc004.pg-207.computing.dcu.ie
     state = state-unknown,down
     np = 1
     properties = all
     ntype = cluster
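
Output like the above can also be checked mechanically. This is only a minimal Python sketch, not part of OSCAR or TORQUE: the helper names parse_pbsnodes and down_nodes are made up for illustration, and it assumes the layout seen here (a node name on its own line, followed by "key = value" lines for that node).

```python
def parse_pbsnodes(text):
    """Parse `pbsnodes -a` style text into {node_name: {key: value}}.

    Assumption: a line without '=' starts a new node, and every
    'key = value' line belongs to the most recent node name.
    """
    nodes = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank separator between nodes
        if "=" in line:
            if current is None:
                continue  # attribute before any node name: ignore
            key, _, value = line.partition("=")
            nodes[current][key.strip()] = value.strip()
        else:
            current = line
            nodes[current] = {}
    return nodes


def down_nodes(nodes):
    """Return the names of nodes whose state list contains 'down'."""
    return [
        name
        for name, attrs in nodes.items()
        if "down" in attrs.get("state", "").split(",")
    ]


# Two of the entries from this message, used as sample input.
SAMPLE = """\
cc001.pg-207.computing.dcu.ie
     state = state-unknown,down
     np = 1
     properties = all
     ntype = cluster

cc002.pg-207.computing.dcu.ie
     state = state-unknown,down
     np = 1
     properties = all
     ntype = cluster
"""

if __name__ == "__main__":
    for name in down_nodes(parse_pbsnodes(SAMPLE)):
        print("down:", name)
```

On the head node, the same functions could be fed the real command output instead of SAMPLE, for example via subprocess.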
>Is pbs_mom running on all your client nodes?
>
A 'ps aux | grep pbs_mom' on all nodes shows it is.

I have tried moving the pbs_oscar alias from the private to the public
address in /etc/hosts, with no success.

To recap:
* OSCAR version 4.2.1b5
* Fedora Core 3
* x86
- Successfully passed test_cluster after initial setup with head node
  and two compute nodes. Happy days.
- Test fails after adding two new nodes, which are up and alive. They can
  mount /home and pass ssh pings, pvm etc., but fail pbs:
  /opt/pbs/bin/pbsnodes: cannot connect to server pbs_oscar, error=111
  The test then fails with "Not enough free nodes".
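
On Linux, errno 111 is ECONNREFUSED: the name resolved and the host answered, but nothing accepted the TCP connection on that port. A rough way to test this independently of pbsnodes is sketched below in Python. The function name check_server is made up for illustration, and port 15001 is assumed to be the pbs_server port (the usual TORQUE default; adjust it if your install differs).

```python
import errno
import os
import socket


def check_server(host="pbs_oscar", port=15001, timeout=3.0):
    """Return (ok, detail) for a plain TCP connect to host:port.

    Assumptions: 'pbs_oscar' is the alias from /etc/hosts and 15001 is
    the port pbs_server listens on.
    """
    try:
        addr = socket.gethostbyname(host)
    except OSError as exc:
        return False, "cannot resolve %s: %s" % (host, exc)

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # connect_ex returns an errno value instead of raising
        code = sock.connect_ex((addr, port))
    except OSError as exc:
        return False, "%s (%s) port %d: %s" % (host, addr, port, exc)
    finally:
        sock.close()

    if code == 0:
        return True, "%s (%s) accepts connections on port %d" % (host, addr, port)
    name = errno.errorcode.get(code, str(code))
    return False, "%s (%s) port %d: %s (%s)" % (
        host, addr, port, name, os.strerror(code))


# Example use on the head node or a compute node:
#   ok, detail = check_server()
#   print(detail)
```

If this reports ECONNREFUSED while pbs_server is believed to be running, the address that pbs_oscar resolves to on that machine is worth comparing against the address the server is actually listening on.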

/nc

>Cheers,
>
>Bernard
>
>>well it was going well
>>
>>i added two more nodes
>>and now it fails
>>
>>[EMAIL PROTECTED] oscar]# testing/test_cluster
>>Performing root tests...
>>Maui service check:maui                                    [PASSED]
>>Shutting down TORQUE Server:                               [ OK ]
>>Connection refused
>>/opt/pbs/bin/pbsnodes: cannot connect to server pbs_oscar, error=111
>>Torque node check                                          [PASSED]
>>Starting TORQUE Server:                                    [ OK ]
>>Torque service check:pbs_server                            [PASSED]
>>/home mounts                                               [PASSED]
>>
>>Preparing user tests...
>>Performing user tests...
>>SSH ping test                                              [PASSED]
>>SSH server->node                                           [PASSED]
>>SSH node->server                                           [PASSED]
>>Checking for 4 free nodes:                                 [FAILED]
>>Not enough free nodes. Tests incomplete.
>>Checking for 4 free nodes:                                 [FAILED]
>>Not enough free nodes. Tests incomplete.
>>Checking for 4 free nodes:                                 [FAILED]
>>Not enough free nodes. Tests incomplete.
>>Torque default queue definition                            [PASSED]
>>Checking for 4 free nodes:                                 [FAILED]
>>Not enough free nodes. Tests incomplete.
>>Ganglia setup test                                         [PASSED]
>>Ganglia node count test                                    [PASSED]
