It seems I fixed the problem.

/var/spool/pbs/mom_priv/config pointed at the master node's external 
interface rather than its internal private address.  It should have just 
used pbs_oscar for the hostname, but instead it used the FQDN of the 
public interface.  Perhaps a bug?

I changed this and restarted pbs_mom on all nodes, and now I get what I expect.
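
For anyone hitting the same symptom, the check can be sketched in Python. This is only a rough sketch and not part of OSCAR or Torque. It assumes the MOM config names the server on a `$clienthost` or `$pbsserver` line, and that the private cluster network is 10.2.6.0/24, as the node entries in the /etc/hosts excerpt quoted further down suggest. The function names and the sample config strings are made up for illustration. It only reads hosts-file text; it does not query DNS or reproduce the resolver's lookup order.

```python
import ipaddress

# Assumption: the private cluster network, inferred from the 10.2.6.x host entries.
PRIVATE_NET = ipaddress.ip_network("10.2.6.0/24")


def parse_hosts(hosts_text):
    """Map each name in /etc/hosts-style text to the addresses it is listed under."""
    table = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        addr, *names = line.split()
        for name in names:
            table.setdefault(name, []).append(addr)
    return table


def server_names(mom_config_text):
    """Return the hostnames found on $clienthost / $pbsserver lines."""
    names = []
    for line in mom_config_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] in ("$clienthost", "$pbsserver"):
            names.append(parts[1])
    return names


def check(mom_config_text, hosts_text):
    """Return (name, address, is_private) for every address a server name maps to."""
    table = parse_hosts(hosts_text)
    results = []
    for name in server_names(mom_config_text):
        for addr in table.get(name, []):
            is_private = ipaddress.ip_address(addr) in PRIVATE_NET
            results.append((name, addr, is_private))
    return results


# The two master-node lines from the /etc/hosts excerpt in this thread.
HOSTS = """\
10.2.6.199 oscar-control.blah.com oscar-control oscar_server nfs_oscar pbs_oscar
172.21.184.192 oscar-control.blah.com oscar-control
"""

if __name__ == "__main__":
    # Hypothetical config contents: the FQDN variant versus the pbs_oscar alias.
    for cfg in ("$clienthost oscar-control.blah.com\n", "$clienthost pbs_oscar\n"):
        print(cfg.strip(), "->", check(cfg, HOSTS))
```

The idea is that a name listed under more than one address (here the FQDN) is ambiguous, while pbs_oscar appears only under the private address.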

Thanks
-jeremy

On Fri, 29 Apr 2005, Bernard Li wrote:

> Well, notice that according to Torque/PBS the state of your nodes is
> 'unknown/down'.  You need to figure out why that's the case and how to
> make them 'available' again.
> 
> Cheers,
> 
> Bernard 
> 
> > -----Original Message-----
> > From: Jeremy Hansen [mailto:[EMAIL PROTECTED] 
> > Sent: Friday, April 29, 2005 14:55
> > To: Bernard Li
> > Cc: [email protected]
> > Subject: RE: [Oscar-users] Not enough free nodes. Tests incomplete.
> > 
> > 
> > I also noticed this using maui's checkjob
> > 
> > [EMAIL PROTECTED] bin]# ./checkjob 31
> > 
> > 
> > checking job 31
> > 
> > State: Idle  (User: oscartst  Group: oscartst)
> > WallTime: 0:00:00 of   INFINITY
> > SubmitTime: Fri Apr 29 14:52:46
> >   (Time Queued  Total: 0:00:23  Eligible: 0:00:00)
> > 
> > Total Tasks: 1
> > 
> > Req[0]  TaskCount: 1  Partition: ALL
> > Network: [NONE]  Memory >= 0  Disk >= 0  Swap NC 0
> > Opsys: [NONE]  Arch: [NONE]  Class: [workq 1]  Features: [NONE]
> > 
> > 
> > IWD: [NONE]  Executable:  [NONE]
> > QOS: DEFAULT  Bypass: 0  StartCount: 0
> > PartitionMask: [ALL]
> > Flags:       RESTARTABLE
> > 
> > job is deferred.  Reason:  NoResources  (exceeds available partition procs)
> > Holds:    Defer
> > PE:  1.00  StartPriority:  1
> > cannot select job 31 for partition DEFAULT (job hold active)
> > 
> > Job is deferred because there are no resources???
> > 
> > -jeremy
> > 
> > On Fri, 29 Apr 2005, Bernard Li wrote:
> > 
> > > Hey Jeremy:
> > > 
> > > What does pbsnodes -a give you?
> > > 
> > > Also, try running the PBS/Torque GUIs (like xpbs, xpbsmon) and see if
> > > the nodes are set up properly...  it seems like the qmaster is set up
> > > but none of the execution nodes are set up...
> > > 
> > > Cheers,
> > > 
> > > Bernard
> > > 
> > > > -----Original Message-----
> > > > From: Jeremy Hansen [mailto:[EMAIL PROTECTED]
> > > > Sent: Friday, April 29, 2005 13:38
> > > > To: Bernard Li
> > > > Cc: [email protected]
> > > > Subject: RE: [Oscar-users] Not enough free nodes. Tests incomplete.
> > > > 
> > > > On Fri, 29 Apr 2005, Bernard Li wrote:
> > > > 
> > > > > Hi Jeremy: 
> > > > > 
> > > > > > on the master node /etc/hosts looks like this:
> > > > > > 
> > > > > > 10.2.6.199 oscar-control.blah.com oscar-control oscar_server nfs_oscar pbs_oscar
> > > > > > 172.21.184.192          oscar-control.blah.com oscar-control
> > > > > > 
> > > > > > # These entries are managed by SIS, please don't modify them.
> > > > > > 10.2.6.1             node1.blah.com  node1
> > > > > > 10.2.6.2             node2.blah.com  node2
> > > > > > 10.2.6.3             node3.blah.com  node3
> > > > > > 10.2.6.4             node4.blah.com  node4
> > > > > > 10.2.6.5             node5.blah.com  node5
> > > > > > 10.2.6.6             node6.blah.com  node6
> > > > > > 10.2.6.7             node7.blah.com  node7
> > > > > > 10.2.6.8             node8.blah.com  node8
> > > > > > 10.2.6.9             node9.blah.com  node9
> > > > > > 10.2.6.10            node10.blah.com node10
> > > > > 
> > > > > I don't see a 127.0.0.1...  It needs to be there.
> > > > 
> > > > It's in there...just skipped it on the paste.
> > > > 
> > > > > > The *.err files in
> > > > > > oscartst are zero length.
> > > > > > 
> > > > > > Where would I find more log files?
> > > > > 
> > > > > Since these are PBS/Torque errors, one place to look for logs is
> > > > > /var/spool/pbs/server_logs.
> > > > 
> > > > Output from the server_log during the test:
> > > > 
> > > > 04/29/2005 13:15:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > > > 04/29/2005 13:16:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > > > 04/29/2005 13:17:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > > > 04/29/2005 13:18:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > > > 04/29/2005 13:19:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > > > 04/29/2005 13:20:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > > > 04/29/2005 13:21:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > > > 04/29/2005 13:22:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > > > 04/29/2005 13:23:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > > > 04/29/2005 13:24:15;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 04/29/2005 13:24:23;0002;PBS_Server;Svr;PBS_Server;Server shutdown completed
> > > > 04/29/2005 13:24:23;0002;PBS_Server;Svr;Log;Log closed
> > > > 04/29/2005 13:24:23;0002;PBS_Server;Svr;Log;Log opened
> > > > 04/29/2005 13:24:23;0006;PBS_Server;Svr;PBS_Server;Server oscar-control started, initialization type = 1
> > > > 04/29/2005 13:24:23;0002;PBS_Server;Svr;Act;Account file /var/spool/pbs/server_priv/accounting/20050429 opened
> > > > 04/29/2005 13:24:23;0040;PBS_Server;Req;setup_nodes;setup_nodes()
> > > > 
> > > > 04/29/2005 13:24:23;0002;PBS_Server;Svr;PBS_Server;Server Ready, pid = 4391
> > > > 04/29/2005 13:24:23;0040;PBS_Server;Svr;oscar-control;Scheduler sent command scheduler_first
> > > > 04/29/2005 13:25:23;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > > > 04/29/2005 13:26:23;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > > > 04/29/2005 13:27:23;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > > > 04/29/2005 13:28:23;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > > > 04/29/2005 13:29:23;0040;PBS_Server;Svr;oscar-control;Scheduler sent command time
> > > > 
> > > > I don't see anything unusual in this log.  One thing I've noticed, and
> > > > perhaps this is just my ignorance of how it functions at the moment,
> > > > is that my submitted jobs do not appear to get scheduled when submitted.
> > > > Just a simple echo script sits in the queue:
> > > > 
> > > > [EMAIL PROTECTED] oscartst]$ qstat
> > > > Job id           Name             User             Time Use S Queue
> > > > ---------------- ---------------- ---------------- -------- - -----
> > > > 21.oscar-control   test.sh          oscartst                0 Q workq
> > > > 22.oscar-control   test.sh          oscartst                0 Q workq
> > > > 
> > > > 04/29/2005 13:36:59;0040;PBS_Server;Svr;oscar-control;Scheduler sent command new
> > > > 
> > > > 
> > > > > Cheers,
> > > > > 
> > > > > Bernard
> > > > > 
> > > > > 
> > > > > -------------------------------------------------------
> > > > > This SF.Net email is sponsored by: NEC IT Guy Games.
> > > > > Get your fingers limbered up and give it your best shot. 4 great
> > > > > events, 4 opportunities to win big! Highest score wins.NEC IT Guy
> > > > > Games. Play to win an NEC 61 plasma display. Visit 
> > > > > http://www.necitguy.com/?r 
> > > > > _______________________________________________
> > > > > Oscar-users mailing list
> > > > > [email protected]
> > > > > https://lists.sourceforge.net/lists/listinfo/oscar-users
> > > > > 
> > > > 
> > > > 
> > > 
> > 
> > 
> 
> 
> 


