On Fri, 29 Apr 2005, Bernard Li wrote:

> Hey Jeremy:
> 
> What does pbsnodes -a give you?

[EMAIL PROTECTED] oscartst]# pbsnodes -a | more
node1
     state = state-unknown,down
     np = 2
     properties = all
     ntype = cluster

node2
     state = state-unknown,down
     np = 2
     properties = all
     ntype = cluster

node3
     state = state-unknown,down
     np = 2
     properties = all
     ntype = cluster

node4
     state = state-unknown,down
     np = 2
     properties = all
     ntype = cluster

node5
     state = state-unknown,down
     np = 2
     properties = all
     ntype = cluster

node6
     state = state-unknown,down
     np = 2
     properties = all
     ntype = cluster

node7
     state = state-unknown,down
     np = 2
     properties = all
     ntype = cluster

node8
     state = state-unknown,down
     np = 2
     properties = all
     ntype = cluster

node9
     state = state-unknown,down
     np = 2
     properties = all
     ntype = cluster

node10
     state = state-unknown,down
     np = 2
     properties = all
     ntype = cluster

Note, I'm purposely removing the domain from this output.
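
With ten near-identical stanzas, a short summary of which nodes are down
is easier to scan than the raw listing.  The following is only a minimal
sketch, assuming the stanza layout shown above (node name on an
unindented line, indented "key = value" attributes, blank line between
nodes).  The helper names parse_pbsnodes and down_nodes are hypothetical
and not part of PBS/Torque, and the "node2 ... free" stanza in the sample
is made up for illustration:

```python
def parse_pbsnodes(text):
    """Parse `pbsnodes -a`-style text into {node_name: {attribute: value}}."""
    nodes = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current node stanza.
            current = None
            continue
        if not line[0].isspace():
            # An unindented line starts a new node stanza.
            current = line.strip()
            nodes[current] = {}
        elif current is not None and "=" in line:
            # An indented "key = value" line belongs to the current node.
            key, _, value = line.partition("=")
            nodes[current][key.strip()] = value.strip()
    return nodes


def down_nodes(nodes):
    """Return the names of nodes whose comma-separated state includes 'down'."""
    return [name for name, attrs in nodes.items()
            if "down" in attrs.get("state", "").split(",")]


# Sample text: node1 is copied from the listing above, node2 is invented.
sample = """node1
     state = state-unknown,down
     np = 2
     properties = all
     ntype = cluster

node2
     state = free
     np = 2
"""

parsed = parse_pbsnodes(sample)
print(down_nodes(parsed))
```

Fed the full listing above, down_nodes would report all ten nodes, since
every stanza carries the state "state-unknown,down".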

> Also, try running the PBS/Torque GUIs (like xpbs, xpbsmon) and see if
> the nodes are set up properly.  It seems like the qmaster is set up
> but none of the execution nodes are.

xpbsmon sees all the nodes, but their state is black or "noinfo".

xpbs sees the server and the queues, but doesn't seem to see any jobs,
although qstat still shows three jobs in the queue.  The queue in xpbs
shows a count of 3, but lists no jobs.

Here's the output:

[EMAIL PROTECTED] oscartst]# xpbs_datadump -t 30 -u root pbs_oscar
:Server                    Max   Tot   Que   Run   Hld   Wat   Trn   Ext Status      PEsInUse
oscar-control               0     3     3     0     0     0     0     0 Active          0/20                    oscar-control
:Queue              Max   Tot   Ena   Str   Que   Run   Hld   Wat   Trn   Ext Type       Server
workq                0     3   yes   yes     3     0     0     0     0     0 Execution  oscar-control


Thanks for all your help.

-jeremy

> Cheers,
> 
> Bernard 
> 
> > -----Original Message-----
> > From: Jeremy Hansen [mailto:[EMAIL PROTECTED] 
> > Sent: Friday, April 29, 2005 13:38
> > To: Bernard Li
> > Cc: [email protected]
> > Subject: RE: [Oscar-users] Not enough free nodes. Tests incomplete.
> > 
> > On Fri, 29 Apr 2005, Bernard Li wrote:
> > 
> > > Hi Jeremy: 
> > > 
> > > > on the master node /etc/hosts looks like this:
> > > > 
> > > > 10.2.6.199 oscar-control.blah.com oscar-control oscar_server nfs_oscar pbs_oscar
> > > > 172.21.184.192          oscar-control.blah.com oscar-control
> > > > 
> > > > # These entries are managed by SIS, please don't modify them.
> > > > 10.2.6.1             node1.blah.com  node1
> > > > 10.2.6.2             node2.blah.com  node2
> > > > 10.2.6.3             node3.blah.com  node3
> > > > 10.2.6.4             node4.blah.com  node4
> > > > 10.2.6.5             node5.blah.com  node5
> > > > 10.2.6.6             node6.blah.com  node6
> > > > 10.2.6.7             node7.blah.com  node7
> > > > 10.2.6.8             node8.blah.com  node8
> > > > 10.2.6.9             node9.blah.com  node9
> > > > 10.2.6.10            node10.blah.com node10
> > > 
> > > I don't see a 127.0.0.1...  It needs to be there.
> > 
> > It's in there...just skipped it on the paste.
> > 
> > > > The *.err files in
> > > > oscartst are zero length.
> > > > 
> > > > Where would I find more log files?
> > > 
> > > Since these are PBS/Torque errors, one place to look for logs is 
> > > /var/spool/pbs/server_logs.
> > 
> > Output from the server_log during the test:
> > 
> > 04/29/2005 13:15:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:16:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:17:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:18:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:19:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:20:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:21:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:22:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:23:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:24:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 
> > 
> > 
> > 
> > 04/29/2005 13:24:23;0002;PBS_Server;Svr;PBS_Server;Server shutdown completed
> > 04/29/2005 13:24:23;0002;PBS_Server;Svr;Log;Log closed
> > 04/29/2005 13:24:23;0002;PBS_Server;Svr;Log;Log opened
> > 04/29/2005 13:24:23;0006;PBS_Server;Svr;PBS_Server;Server oscar-control started, initialization type = 1
> > 04/29/2005 13:24:23;0002;PBS_Server;Svr;Act;Account file /var/spool/pbs/server_priv/accounting/20050429 opened
> > 04/29/2005 13:24:23;0040;PBS_Server;Req;setup_nodes;setup_nodes()
> > 
> > 04/29/2005 13:24:23;0002;PBS_Server;Svr;PBS_Server;Server Ready, pid = 4391
> > 04/29/2005 13:24:23;0040;PBS_Server;Svr;oscar-control;Scheduler sent command scheduler_first
> > 04/29/2005 13:25:23;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:26:23;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:27:23;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:28:23;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 04/29/2005 13:29:23;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > 
> > I don't see anything unusual in this log.  One thing I've
> > noticed, and perhaps this is just my ignorance of how it
> > functions at the moment, is that my submitted jobs do not appear
> > to get scheduled when submitted.  Even a simple echo script just
> > sits in the queue:
> > 
> > [EMAIL PROTECTED] oscartst]$ qstat
> > Job id           Name             User             Time Use S Queue
> > ---------------- ---------------- ---------------- -------- - -----
> > 21.oscar-control   test.sh          oscartst                0 Q workq
> > 22.oscar-control   test.sh          oscartst                0 Q workq
> > 
> > 04/29/2005 13:36:59;0040;PBS_Server;Svr;oscar-control;Scheduler sent command new
> > 
> > 
> > > Cheers,
> > > 
> > > Bernard
> > > 
> > > 
> > > _______________________________________________
> > > Oscar-users mailing list
> > > [email protected]
> > > https://lists.sourceforge.net/lists/listinfo/oscar-users
> > > 
> > 
> > 
> 


