At 01:33 PM 11/13/2002 -0600, Benjamin Simmons wrote:
> PBS that comes with OSCAR. sepbs is a script that launches my code using
> qsub to launch a script that includes pbsdsh.

I would need to see the script, and also the error you're getting from qsub if it's rejecting submission.

> Quite a circular path, but it appears to work. Is there also a way to
> assign a particular job to a particular node?

Yes: resources. The current PBS in OSCAR isn't configured like that, but it will be in the future. In the meantime, run:
for machine in `cat /var/spool/pbs/server_priv/nodes |awk '{ print $1 }'` ; do qmgr -c "set node $machine properties+=$machine" ; done
That will add a machine name resource to each of your machines, and then you can modify your qsub line to do something like:
qsub -l nodes=1:ppn=2:resource=node5 -N my_job_name my_job_script.sh
Jeremy
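
The loop above changes server state for every node, so it can be worth previewing the commands first. The following is only a sketch, not something shipped with OSCAR or PBS: the function name preview_node_property_cmds and the sample nodes file contents are invented for illustration, and it assumes (as the awk call above does) that the node name is the first whitespace-separated field of each line. It prints the qmgr commands instead of running them.

```shell
#!/bin/sh
# Dry run: print one qmgr command per node instead of executing it.
# preview_node_property_cmds is a hypothetical helper name. The nodes file
# is passed as an argument; in the thread it is
# /var/spool/pbs/server_priv/nodes.
preview_node_property_cmds() {
    nodes_file=$1
    # Take the first field of each line. Skipping blank lines and lines
    # starting with '#' is a defensive assumption, not a documented rule.
    awk 'NF && $1 !~ /^#/ { print $1 }' "$nodes_file" |
    while read -r machine; do
        printf 'qmgr -c "set node %s properties+=%s"\n' "$machine" "$machine"
    done
}

# Example with a throwaway nodes file (contents invented for illustration).
sample=$(mktemp)
printf 'node1 np=2\nnode5 np=2\n' > "$sample"
preview_node_property_cmds "$sample"
rm -f "$sample"
```

Once the printed commands look right, the original loop can be run as given, or the preview output can be piped to sh on the PBS server host.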
> Thanks,
> Ben Simmons
>
> Jeremy Enos wrote:
> Ben-
> Are you using PBSPro, or the PBS that comes with OSCAR? What does sepbs do?
>
> Jeremy
>
> At 07:19 AM 11/13/2002 -0600, Benjamin Simmons wrote:
> >When I get 3 jobs running, I cannot add a 4th to the queue. It will not
> >acknowledge it. I checked the maui config file, and it has a max number of
> >jobs set to 8. I have copied the output of these commands below.
> >
> >Thanks again for all your help,
> >
> >Ben Simmons
> >
> >
> >[bdsimmns@jupiter cluster]$ qstat
> >Job id           Name             User             Time Use S Queue
> >---------------- ---------------- ---------------- -------- - -----
> >76.jupiter       se_V3e-5_M1e-3   bdsimmns         09:31:59 R workq
> >77.jupiter       se_V3e-5_M2e-3   bdsimmns         09:32:05 R workq
> >
> >
> >[bdsimmns@jupiter cluster]$ sepbs V3e-5_M3e-3
> >PBS: Now Executing SurfaceEvolver, case= V3e-5_M3e-3
> >82.jupiter.cfd.me.memphis.edu
> >Script completed-Your job is running!
> >
> >[bdsimmns@jupiter cluster]$ qstat
> >Job id           Name             User             Time Use S Queue
> >---------------- ---------------- ---------------- -------- - -----
> >76.jupiter       se_V3e-5_M1e-3   bdsimmns         09:31:59 R workq
> >77.jupiter       se_V3e-5_M2e-3   bdsimmns         09:32:05 R workq
> >82.jupiter       se_V3e-5_M3e-3   bdsimmns                0 R workq
> >
> >[bdsimmns@jupiter cluster]$ sepbs V3e-5_M4e-3
> >PBS: Now Executing SurfaceEvolver, case= V3e-5_M4e-3
> >83.jupiter.cfd.me.memphis.edu
> >Script completed-Your job is running!
> >
> >[bdsimmns@jupiter cluster]$ qstat
> >Job id           Name             User             Time Use S Queue
> >---------------- ---------------- ---------------- -------- - -----
> >76.jupiter       se_V3e-5_M1e-3   bdsimmns         09:31:59 R workq
> >77.jupiter       se_V3e-5_M2e-3   bdsimmns         09:32:05 R workq
> >82.jupiter       se_V3e-5_M3e-3   bdsimmns                0 R workq
> >
> >[bdsimmns@jupiter cluster]$
> >
> >
> >Bruce Becker wrote:
> >
> > > Hi Benjamin
> > >
> > > We have been using OSCAR software to set up our cluster here too. We could
> > > never get the distribution of PBS that gets shipped with OSCAR to work
> > > properly, so we went instead for PBSPro. Since we are an educational
> > > institution, we pay zero niks nada for it. You guys should investigate the
> > > option as well, I think, because we've had a wonderfully easy time using
> > > PBSPro.
> > >
> > > As for your current problem, I think the first place to start looking
> > > would be your queue definition... try qmgr and do a print queue:
> > >
> > > becker@qgp3:/home/becker>qmgr
> > > Max open servers: 4
> > > Qmgr: print queue workq
> > > #
> > > # Create queues and set their attributes.
> > > #
> > > #
> > > # Create and define queue workq
> > > #
> > > create queue workq
> > > set queue workq queue_type = Execution
> > > set queue workq resources_available.ncpus = 20
> > > set queue workq max_user_run = 17
> > > set queue workq enabled = True
> > > set queue workq started = True
> > >
> > > to see what the default is set to. The other easy way to debug is to use
> > > xpbs to see the details of your queue. If you can submit more than one job
> > > and see it run to completion, but get limited to a number of concurrent
> > > jobs, then it sounds like it's the queue policy that's kicking in. Do the
> > > other jobs you submit get queued at all? That is, do you see a 'Q' state for
> > > them when you run qstat? For example:
> > >
> > > 524.qgp3         run_swim2a_msel  steinber         05:38:03 R workq
> > > 525.qgp3         run_swim2a_msel  steinber         05:38:25 R workq
> > > 526.qgp3         run_swim2a_msel  steinber         05:38:03 R workq
> > > 527.qgp3         run_swim2a_msel  steinber         05:38:27 R workq
> > > 528.qgp3         run_swim2a_msel  steinber                0 Q workq
> > > 529.qgp3         run_swim2a_msel  steinber                0 Q workq
> > >
> > > Hope I was of some help. Good luck!
> > >
> > > On Wed, 13 Nov 2002, Benjamin Simmons wrote:
> > >
> > > > I have installed and tested my first cluster using this tool. It is
> > > > great!!!!
> > > >
> > > > When trying to run several jobs via PBS, I have only been able to get 3
> > > > jobs to be processed at one time. Is there a limit to the number of
> > > > concurrent jobs that a single user can run? Is there a way to specify
> > > > the nodes you desire a specific job to run on?
> > > >
> > > > Thanks,
> > > >
> > > > Benjamin Simmons
> > > > The University of Memphis
> > > > Mechanical Engineering Department
> > > >
> > > >
> > > >
> > > > -------------------------------------------------------
> > > > This sf.net email is sponsored by: Are you worried about
> > > > your web server security? Click here for a FREE Thawte
> > > > Apache SSL Guide and answer your Apache SSL security
> > > > needs: http://www.gothawte.com/rd523.html
> > > > _______________________________________________
> > > > Oscar-users mailing list
> > > > [EMAIL PROTECTED]
> > > > https://lists.sourceforge.net/lists/listinfo/oscar-users
> > > >
> > >
> > > Bruce Becker, PhD student - Department of Physics
> > > University of Cape Town
> > >
> > > Room 405, R.W. James Building, UCT
> > > University Avenue North
> > > Private Bag RONDEBOSCH
> > > 7700
> > >
> > > tel (w) +27 21 650 3356
> > > tel (m) +27 82 537 9425
> > > fax +27 21 650 3342
> > >
> > > http://hep.phy.uct.ac.za/~becker
> >
