On Fri, 29 Apr 2005, Bernard Li wrote:
> Hey Jeremy:
>
> What does pbsnodes -a give you?
[EMAIL PROTECTED] oscartst]# pbsnodes -a | more
node1
     state = state-unknown,down
     np = 2
     properties = all
     ntype = cluster

node2
     state = state-unknown,down
     np = 2
     properties = all
     ntype = cluster

node3
     state = state-unknown,down
     np = 2
     properties = all
     ntype = cluster

node4
     state = state-unknown,down
     np = 2
     properties = all
     ntype = cluster

node5
     state = state-unknown,down
     np = 2
     properties = all
     ntype = cluster

node6
     state = state-unknown,down
     np = 2
     properties = all
     ntype = cluster

node7
     state = state-unknown,down
     np = 2
     properties = all
     ntype = cluster

node8
     state = state-unknown,down
     np = 2
     properties = all
     ntype = cluster

node9
     state = state-unknown,down
     np = 2
     properties = all
     ntype = cluster

node10
     state = state-unknown,down
     np = 2
     properties = all
     ntype = cluster

Note, I'm purposely removing the domain from this output.
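
In case it helps anyone scanning this kind of output, here is a minimal Python sketch that groups `pbsnodes -a` lines by node and lists the nodes whose state includes "down". It assumes only the layout shown above (a bare node name followed by `key = value` lines). The names `parse_pbsnodes` and `down_nodes` are made up for illustration and are not part of PBS/Torque, and the `free` state on node2 in the sample is invented for contrast.

```python
def parse_pbsnodes(text):
    """Parse `pbsnodes -a` style text into {node_name: {attribute: value}}."""
    nodes = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank separator between nodes
        if " = " in line:
            if current is None:
                continue  # attribute seen before any node name; ignore it
            key, value = line.split(" = ", 1)
            nodes[current][key.strip()] = value.strip()
        else:
            # Any non-empty line without " = " is treated as a node name.
            current = line
            nodes[current] = {}
    return nodes


def down_nodes(nodes):
    """Return names of nodes whose comma-separated state list contains 'down'."""
    return [name for name, attrs in nodes.items()
            if "down" in attrs.get("state", "").split(",")]


# node1 is copied from the output above; node2's "free" state is hypothetical.
sample = """\
node1
state = state-unknown,down
np = 2
properties = all
ntype = cluster
node2
state = free
np = 2
properties = all
ntype = cluster
"""

parsed = parse_pbsnodes(sample)
print(down_nodes(parsed))
```

Feeding it the real output (for example, saved to a file first) would show at a glance whether every node is down or only some of them.
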
> Also, try running the PBS/Torque GUIs (like xpbs, xpbsmon) and see if
> the nodes are set up properly... it seems like the qmaster is set up
> but none of the execution nodes are set up...
xpbsmon sees all the nodes, but their state is black or "noinfo".
xpbs sees the server and the queues, but doesn't seem to see any jobs,
although qstat still shows three jobs in the queue. The queue in xpbs
shows a total of 3, but lists no jobs.
Here's the output:
[EMAIL PROTECTED] oscartst]# xpbs_datadump -t 30 -u root pbs_oscar
:Server        Max Tot Que Run Hld Wat Trn Ext Status PEsInUse
oscar-control    0   3   3   0   0   0   0   0 Active 0/20     oscar-control
:Queue         Max Tot Ena Str Que Run Hld Wat Trn Ext Type      Server
workq            0   3 yes yes   3   0   0   0   0   0 Execution oscar-control
Thanks for all your help.
-jeremy
> Cheers,
>
> Bernard
>
> > -----Original Message-----
> > From: Jeremy Hansen [mailto:[EMAIL PROTECTED]
> > Sent: Friday, April 29, 2005 13:38
> > To: Bernard Li
> > Cc: [email protected]
> > Subject: RE: [Oscar-users] Not enough free nodes. Tests incomplete.
> >
> > On Fri, 29 Apr 2005, Bernard Li wrote:
> >
> > > Hi Jeremy:
> > >
> > > > on the master node /etc/hosts looks like this:
> > > >
> > > > 10.2.6.199 oscar-control.blah.com oscar-control oscar_server nfs_oscar pbs_oscar
> > > > 172.21.184.192 oscar-control.blah.com oscar-control
> > > >
> > > > # These entries are managed by SIS, please don't modify them.
> > > > 10.2.6.1 node1.blah.com node1
> > > > 10.2.6.2 node2.blah.com node2
> > > > 10.2.6.3 node3.blah.com node3
> > > > 10.2.6.4 node4.blah.com node4
> > > > 10.2.6.5 node5.blah.com node5
> > > > 10.2.6.6 node6.blah.com node6
> > > > 10.2.6.7 node7.blah.com node7
> > > > 10.2.6.8 node8.blah.com node8
> > > > 10.2.6.9 node9.blah.com node9
> > > > 10.2.6.10 node10.blah.com node10
> > >
> > > I don't see a 127.0.0.1... It needs to be there.
> >
> > It's in there...just skipped it on the paste.
> >
> > > > The *.err files in
> > > > oscartst are zero length.
> > > >
> > > > Where would I find more log files?
> > >
> > > Since these are PBS/Torque errors, one place to look for logs is
> > > /var/spool/pbs/server_logs.
> >
> > Output from the server_log during the test:
> >
> > 04/29/2005 13:15:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:16:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:17:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:18:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:19:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:20:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:21:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:22:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:23:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:24:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> >
> > 04/29/2005 13:24:23;0002;PBS_Server;Svr;PBS_Server;Server shutdown completed
> > 04/29/2005 13:24:23;0002;PBS_Server;Svr;Log;Log closed
> > 04/29/2005 13:24:23;0002;PBS_Server;Svr;Log;Log opened
> > 04/29/2005 13:24:23;0006;PBS_Server;Svr;PBS_Server;Server oscar-control started, initialization type = 1
> > 04/29/2005 13:24:23;0002;PBS_Server;Svr;Act;Account file /var/spool/pbs/server_priv/accounting/20050429 opened
> > 04/29/2005 13:24:23;0040;PBS_Server;Req;setup_nodes;setup_nodes()
> > 04/29/2005 13:24:23;0002;PBS_Server;Svr;PBS_Server;Server Ready, pid = 4391
> > 04/29/2005 13:24:23;0040;PBS_Server;Svr;oscar-control;Scheduler sent command scheduler_first
> > 04/29/2005 13:25:23;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:26:23;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:27:23;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:28:23;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:29:23;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> >
> > I don't see anything unusual in this log. One thing I've
> > noticed, and perhaps this is just my ignorance of how it
> > functions at the moment, is that my submitted jobs do not appear
> > to get scheduled. Even a simple echo script just
> > sits in the queue:
> >
> > [EMAIL PROTECTED] oscartst]$ qstat
> > Job id Name User Time Use S Queue
> > ---------------- ---------------- ---------------- -------- - -----
> > 21.oscar-control test.sh oscartst 0 Q workq
> > 22.oscar-control test.sh oscartst 0 Q workq
> >
> > 04/29/2005 13:36:59;0040;PBS_Server;Svr;oscar-control;Scheduler sent command new
> >
> >
> > > Cheers,
> > >
> > > Bernard
> > >
> > >
> > > -------------------------------------------------------
> > > This SF.Net email is sponsored by: NEC IT Guy Games.
> > > Get your fingers limbered up and give it your best shot. 4 great
> > > events, 4 opportunities to win big! Highest score wins.NEC IT Guy
> > > Games. Play to win an NEC 61 plasma display. Visit
> > > http://www.necitguy.com/?r
> > > _______________________________________________
> > > Oscar-users mailing list
> > > [email protected]
> > > https://lists.sourceforge.net/lists/listinfo/oscar-users
> > >
> >
> >
>
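
For sifting through server_log excerpts like the one quoted above, a small Python sketch along these lines may help. It assumes each record is a single line of the form `date time;code;daemon;class;object;message`, which is how the quoted lines appear once the mail wrapping is undone. The field names and the `parse_log_line` helper are my own labels, not PBS/Torque terminology. It simply counts messages so that anything other than the routine scheduler chatter stands out.

```python
from collections import Counter
from datetime import datetime


def parse_log_line(line):
    """Split one record into a dict, or return None if it does not match."""
    # Split on the first five semicolons only, so the message may contain ';'.
    parts = line.strip().split(";", 5)
    if len(parts) != 6:
        return None
    stamp, code, daemon, obj_class, obj_name, message = parts
    try:
        when = datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S")
    except ValueError:
        return None
    return {
        "when": when,
        "code": code,          # field names here are assumed labels
        "daemon": daemon,
        "class": obj_class,
        "object": obj_name,
        "message": message,
    }


# Lines taken from the quoted log, with the mail wrapping removed.
sample = """\
04/29/2005 13:23:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
04/29/2005 13:24:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
04/29/2005 13:24:23;0002;PBS_Server;Svr;PBS_Server;Server shutdown completed
04/29/2005 13:24:23;0006;PBS_Server;Svr;PBS_Server;Server oscar-control started, initialization type = 1
"""

records = [r for r in map(parse_log_line, sample.splitlines()) if r]
counts = Counter(r["message"] for r in records)
for message, n in counts.most_common():
    print(n, message)
```

Lines that do not fit the assumed layout are skipped rather than guessed at, so wrapped or truncated records will not be miscounted.
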