Have you tried setting them in the config files themselves?
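
As far as I know, the resources_assigned.* values are computed by pbs_server
from whatever jobs are currently running, which is why qmgr refuses to set
them; they are not meant to be edited by hand. If you want to review what is
actually configured, a rough sketch (untested against your setup):

  # dump the current server and queue definitions for review
  qmgr -c "print server" > /tmp/pbs_server.qmgr

  # settable attributes can then be adjusted directly, for example:
  qmgr -c "set server resources_available.nodect = 64"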

On 9/18/07, Nilesh Mistry <[EMAIL PROTECTED]> wrote:
> Hello
>
> I was checking the PBS server settings (qmgr -c "list server") and noticed
> the following values:
>
>  resources_assigned.ncpus = 2
>  resources_assigned.nodect = 24
>
> Could these 2 values be my problem?  I have been trying to change these
> as root but I get:
>
> qmgr obj=master svr=master: Cannot set attribute, read only or
> insufficient permission  resources_assigned.nodect
>
>
> Thanks
>
> Nilesh Mistry
> Academic Computing Services
> [EMAIL PROTECTED] & TEL Campus
> Seneca College of Applied Arts & Technology
> 70 The Pond Road
> Toronto, Ontario
> M3J 3M6 Canada
> Phone 416 491 5050 ext 3788
> Fax 416 661 4695
> http://acs.senecac.on.ca
>
>
>
> Nilesh Mistry wrote:
> > Hello Michael
> >
> > I have tried this as well with the same result.
> >
> > I have even set properties on the quad-core nodes and specified them in
> > the PBS script so that only those nodes are selected, but I still see the
> > same problem.
> >
> > I have also created a new queue with the quad-core node property added,
> > resources_max.nodes = 228:ppn=4:i965; the result is still the same.
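
One common way (as far as I know) to tie a queue to a node property such as
i965 is via resources_default.neednodes; a rough sketch, with a made-up queue
name and counts taken from the 50 quad-core nodes mentioned further down in
the thread. Note that the leading number in a nodes= spec is normally a node
count rather than a core count, so 50:ppn=4 may be closer to what you want
than 228:ppn=4; worth checking against the Torque docs:

  qmgr -c "create queue quadcore queue_type=execution"
  qmgr -c "set queue quadcore resources_default.neednodes = i965"
  qmgr -c "set queue quadcore resources_max.nodect = 50"
  qmgr -c "set queue quadcore resources_max.nodes = 50:ppn=4:i965"
  qmgr -c "set queue quadcore enabled = true"
  qmgr -c "set queue quadcore started = true"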
> >
> > Thanks
> >
> > Nilesh Mistry
> > Academic Computing Services
> > [EMAIL PROTECTED] & TEL Campus
> > Seneca College of Applied Arts & Technology
> > 70 The Pond Road
> > Toronto, Ontario
> > M3J 3M6 Canada
> > Phone 416 491 5050 ext 3788
> > Fax 416 661 4695
> > http://acs.senecac.on.ca
> >
> >
> >
> > Michael Edwards wrote:
> >
> >> set queue workq resources_max.ncpus = 200
> >> set queue workq resources_max.nodect = 64
> >> set queue workq resources_max.nodes = 200:ppn=4
> >>
> >> This should probably be 50*4 + 14*2 = 228
> >>
> >> set queue workq resources_max.ncpus = 228
> >> set queue workq resources_max.nodect = 64
> >> set queue workq resources_max.nodes = 228:ppn=4
> >>
> >> You might want to try making two queues, though; I don't know how well
> >> Torque deals with having different numbers of ppn on different nodes.
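
A sketch of what that two-queue split could look like, with made-up queue
names and limits taken from the 50 quad-core / 14 dual-CPU counts mentioned
further down in the thread:

  # queue aimed at the 50 quad-core nodes
  qmgr -c "create queue quad queue_type=execution"
  qmgr -c "set queue quad resources_max.nodect = 50"
  qmgr -c "set queue quad resources_max.nodes = 50:ppn=4"
  qmgr -c "set queue quad resources_max.ncpus = 200"
  qmgr -c "set queue quad enabled = true"
  qmgr -c "set queue quad started = true"

  # queue aimed at the 14 dual-CPU nodes
  qmgr -c "create queue dual queue_type=execution"
  qmgr -c "set queue dual resources_max.nodect = 14"
  qmgr -c "set queue dual resources_max.nodes = 14:ppn=2"
  qmgr -c "set queue dual resources_max.ncpus = 28"
  qmgr -c "set queue dual enabled = true"
  qmgr -c "set queue dual started = true"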
> >>
> >> On 9/18/07, Nilesh Mistry <[EMAIL PROTECTED]> wrote:
> >>
> >>
> >>> I get the following error after using qsub:
> >>>
> >>> qsub: Job exceeds queue resource limits
> >>>
> >>> If I change #PBS -l nodes=64 to #PBS -l nodes=60, the job is submitted
> >>> and runs, but then it fails.
> >>>
> >>> ################  qstat -f ############################
> >>>
> >>> Job Id: 924.master.atar.senecac.on.ca
> >>>     Job_Name = scaling_test
> >>>     Job_Owner = [EMAIL PROTECTED]
> >>>     job_state = R
> >>>     queue = parallel
> >>>     server = master.atar.senecac.on.ca
> >>>     Checkpoint = u
> >>>     ctime = Tue Sep 18 09:09:45 2007
> >>>     Error_Path =
> >>> master:/home/faculty/nilesh.mistry/pbs/multi/scaling_test/scal
> >>>         ing_test.err
> >>>     exec_host =
> >>> atarnode59.atar.senecac.on.ca/1+atarnode59.atar.senecac.on.ca/0
> >>>
> >>> +atarnode57.atar.senecac.on.ca/1+atarnode57.atar.senecac.on.ca/0+atarno
> >>>
> >>> de56.atar.senecac.on.ca/1+atarnode56.atar.senecac.on.ca/0+atarnode55.at
> >>>
> >>> ar.senecac.on.ca/1+atarnode55.atar.senecac.on.ca/0+atarnode54.atar.sene
> >>>
> >>> cac.on.ca/1+atarnode54.atar.senecac.on.ca/0+atarnode53.atar.senecac.on.
> >>>
> >>> ca/1+atarnode53.atar.senecac.on.ca/0+atarnode52.atar.senecac.on.ca/1+at
> >>>
> >>> arnode52.atar.senecac.on.ca/0+atarnode51.atar.senecac.on.ca/1+atarnode5
> >>>
> >>> 1.atar.senecac.on.ca/0+atarnode50.atar.senecac.on.ca/2+atarnode50.atar.
> >>>
> >>> senecac.on.ca/1+atarnode50.atar.senecac.on.ca/0+atarnode49.atar.senecac
> >>>
> >>> .on.ca/2+atarnode49.atar.senecac.on.ca/1+atarnode49.atar.senecac.on.ca/
> >>>
> >>> 0+atarnode48.atar.senecac.on.ca/2+atarnode48.atar.senecac.on.ca/1+atarn
> >>>
> >>> ode48.atar.senecac.on.ca/0+atarnode47.atar.senecac.on.ca/2+atarnode47.a
> >>>
> >>> tar.senecac.on.ca/1+atarnode47.atar.senecac.on.ca/0+atarnode45.atar.sen
> >>>
> >>> ecac.on.ca/2+atarnode45.atar.senecac.on.ca/1+atarnode45.atar.senecac.on
> >>>
> >>> .ca/0+atarnode44.atar.senecac.on.ca/2+atarnode44.atar.senecac.on.ca/1+a
> >>>
> >>> tarnode44.atar.senecac.on.ca/0+atarnode42.atar.senecac.on.ca/2+atarnode
> >>>
> >>> 42.atar.senecac.on.ca/1+atarnode42.atar.senecac.on.ca/0+atarnode41.atar
> >>>
> >>> .senecac.on.ca/2+atarnode41.atar.senecac.on.ca/1+atarnode41.atar.seneca
> >>>
> >>> c.on.ca/0+atarnode40.atar.senecac.on.ca/2+atarnode40.atar.senecac.on.ca
> >>>
> >>> /1+atarnode40.atar.senecac.on.ca/0+atarnode39.atar.senecac.on.ca/2+atar
> >>>
> >>> node39.atar.senecac.on.ca/1+atarnode39.atar.senecac.on.ca/0+atarnode38.
> >>>
> >>> atar.senecac.on.ca/2+atarnode38.atar.senecac.on.ca/1+atarnode38.atar.se
> >>>
> >>> necac.on.ca/0+atarnode37.atar.senecac.on.ca/2+atarnode37.atar.senecac.o
> >>>
> >>> n.ca/1+atarnode37.atar.senecac.on.ca/0+atarnode36.atar.senecac.on.ca/2+
> >>>
> >>> atarnode36.atar.senecac.on.ca/1+atarnode36.atar.senecac.on.ca/0+atarnod
> >>>
> >>> e35.atar.senecac.on.ca/2+atarnode35.atar.senecac.on.ca/1+atarnode35.ata
> >>>
> >>> r.senecac.on.ca/0+atarnode34.atar.senecac.on.ca/1+atarnode34.atar.senec
> >>>         ac.on.ca/0
> >>>     Hold_Types = n
> >>>     Join_Path = oe
> >>>     Keep_Files = n
> >>>     Mail_Points = abe
> >>>     Mail_Users = nilesh.mistry
> >>>     mtime = Tue Sep 18 09:09:46 2007
> >>>     Output_Path =
> >>> master:/home/faculty/nilesh.mistry/pbs/multi/scaling_test/sca
> >>>         ling_test.log
> >>>     Priority = 0
> >>>     qtime = Tue Sep 18 09:09:45 2007
> >>>     Rerunable = True
> >>>     Resource_List.cput = 10000:00:00
> >>>     Resource_List.mem = 64000mb
> >>>     Resource_List.ncpus = 1
> >>>     Resource_List.nodect = 60
> >>>     Resource_List.nodes = 60
> >>>     Resource_List.walltime = 10000:00:00
> >>>     Variable_List = PBS_O_HOME=/home/faculty/nilesh.mistry,
> >>>         PBS_O_LANG=en_CA.UTF-8,PBS_O_LOGNAME=nilesh.mistry,
> >>>
> >>> PBS_O_PATH=/usr/kerberos/bin:/opt/lam-7.1.2/bin:/usr/local/bin:/bin:/u
> >>>
> >>> sr/bin:/usr/X11R6/bin:/opt/env-switcher/bin:/opt/pvm3/lib:/opt/pvm3/lib
> >>>
> >>> /LINUX:/opt/pvm3/bin/LINUX:/opt/pbs/bin:/opt/pbs/lib/xpbs/bin:/opt/c3-4
> >>>
> >>> /:/home/faculty/nilesh.mistry/bin:/opt/maui/bin:/usr/lib/news/bin:/home
> >>>         /faculty/nilesh.mistry/scripts,
> >>>         PBS_O_MAIL=/var/spool/mail/nilesh.mistry,PBS_O_SHELL=/bin/bash,
> >>>         PBS_O_HOST=master.atar.senecac.on.ca,
> >>>         PBS_O_WORKDIR=/home/faculty/nilesh.mistry/pbs/multi/scaling_test,
> >>>         PBS_O_QUEUE=parallel
> >>>     etime = Tue Sep 18 09:09:45 2007
> >>>
> >>> ###################### Log file ##############################
> >>>
> >>> ------------------------------------------------------
> >>>  This job is allocated on 60 cpu(s)
> >>> Job is running on node(s):
> >>> atarnode59.atar.senecac.on.ca
> >>> atarnode59.atar.senecac.on.ca
> >>> atarnode57.atar.senecac.on.ca
> >>> atarnode57.atar.senecac.on.ca
> >>> atarnode56.atar.senecac.on.ca
> >>> atarnode56.atar.senecac.on.ca
> >>> atarnode55.atar.senecac.on.ca
> >>> atarnode55.atar.senecac.on.ca
> >>> atarnode54.atar.senecac.on.ca
> >>> atarnode54.atar.senecac.on.ca
> >>> atarnode53.atar.senecac.on.ca
> >>> atarnode53.atar.senecac.on.ca
> >>> atarnode52.atar.senecac.on.ca
> >>> atarnode52.atar.senecac.on.ca
> >>> atarnode51.atar.senecac.on.ca
> >>> atarnode51.atar.senecac.on.ca
> >>> atarnode50.atar.senecac.on.ca
> >>> atarnode50.atar.senecac.on.ca
> >>> atarnode50.atar.senecac.on.ca
> >>> atarnode49.atar.senecac.on.ca
> >>> atarnode49.atar.senecac.on.ca
> >>> atarnode49.atar.senecac.on.ca
> >>> atarnode48.atar.senecac.on.ca
> >>> atarnode48.atar.senecac.on.ca
> >>> atarnode48.atar.senecac.on.ca
> >>> atarnode47.atar.senecac.on.ca
> >>> atarnode47.atar.senecac.on.ca
> >>> atarnode47.atar.senecac.on.ca
> >>> atarnode45.atar.senecac.on.ca
> >>> atarnode45.atar.senecac.on.ca
> >>> atarnode45.atar.senecac.on.ca
> >>> atarnode44.atar.senecac.on.ca
> >>> atarnode44.atar.senecac.on.ca
> >>> atarnode44.atar.senecac.on.ca
> >>> atarnode42.atar.senecac.on.ca
> >>> atarnode42.atar.senecac.on.ca
> >>> atarnode42.atar.senecac.on.ca
> >>> atarnode41.atar.senecac.on.ca
> >>> atarnode41.atar.senecac.on.ca
> >>> atarnode41.atar.senecac.on.ca
> >>> atarnode40.atar.senecac.on.ca
> >>> atarnode40.atar.senecac.on.ca
> >>> atarnode40.atar.senecac.on.ca
> >>> atarnode39.atar.senecac.on.ca
> >>> atarnode39.atar.senecac.on.ca
> >>> atarnode39.atar.senecac.on.ca
> >>> atarnode38.atar.senecac.on.ca
> >>> atarnode38.atar.senecac.on.ca
> >>> atarnode38.atar.senecac.on.ca
> >>> atarnode37.atar.senecac.on.ca
> >>> atarnode37.atar.senecac.on.ca
> >>> atarnode37.atar.senecac.on.ca
> >>> atarnode36.atar.senecac.on.ca
> >>> atarnode36.atar.senecac.on.ca
> >>> atarnode36.atar.senecac.on.ca
> >>> atarnode35.atar.senecac.on.ca
> >>> atarnode35.atar.senecac.on.ca
> >>> atarnode35.atar.senecac.on.ca
> >>> atarnode34.atar.senecac.on.ca
> >>> atarnode34.atar.senecac.on.ca
> >>> PBS: qsub is running on master.atar.senecac.on.ca
> >>> PBS: originating queue is parallel
> >>> PBS: executing queue is parallel
> >>> PBS: working directory is 
> >>> /home/faculty/nilesh.mistry/pbs/multi/scaling_test
> >>> PBS: execution mode is PBS_BATCH
> >>> PBS: job identifier is 924.master.atar.senecac.on.ca
> >>> PBS: job name is scaling_test
> >>> PBS: node file is /var/spool/pbs/aux//924.master.atar.senecac.on.ca
> >>> PBS: current home directory is /home/faculty/nilesh.mistry
> >>> PBS: PATH =
> >>> /usr/kerberos/bin:/opt/lam-7.1.2/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/opt/env-switcher/bin:/opt/pvm3/lib:/opt/pvm3/lib/LINUX:/opt/pvm3/bi
> >>> n/LINUX:/opt/pbs/bin:/opt/pbs/lib/xpbs/bin:/opt/c3-4/:/home/faculty/nilesh.mistry/bin:/opt/maui/bin:/usr/lib/news/bin:/home/faculty/nilesh.mistry/scripts
> >>> ------------------------------------------------------
> >>> Mesh  1 of 60 is alive on atarnode59.atar.senecac.on.ca
> >>> Mesh 17 of 60 is alive on atarnode50.atar.senecac.on.ca
> >>> Mesh 18 of 60 is alive on atarnode50.atar.senecac.on.ca
> >>> Mesh  3 of 60 is alive on atarnode57.atar.senecac.on.ca
> >>> Mesh 50 of 60 is alive on atarnode37.atar.senecac.on.ca
> >>> Mesh 51 of 60 is alive on atarnode37.atar.senecac.on.ca
> >>> Mesh 58 of 60 is alive on atarnode35.atar.senecac.on.ca
> >>> Mesh 15 of 60 is alive on atarnode51.atar.senecac.on.ca
> >>> Mesh 56 of 60 is alive on atarnode35.atar.senecac.on.ca
> >>> Mesh 47 of 60 is alive on atarnode38.atar.senecac.on.ca
> >>> Mesh 41 of 60 is alive on atarnode40.atar.senecac.on.ca
> >>> Mesh 43 of 60 is alive on atarnode40.atar.senecac.on.ca
> >>> Mesh 23 of 60 is alive on atarnode48.atar.senecac.on.ca
> >>> Mesh 59 of 60 is alive on atarnode34.atar.senecac.on.ca
> >>> Mesh 44 of 60 is alive on atarnode39.atar.senecac.on.ca
> >>> Mesh 60 of 60 is alive on atarnode34.atar.senecac.on.ca
> >>> Mesh 26 of 60 is alive on atarnode47.atar.senecac.on.ca
> >>> Mesh 46 of 60 is alive on atarnode39.atar.senecac.on.ca
> >>> Mesh 42 of 60 is alive on atarnode40.atar.senecac.on.ca
> >>> Mesh 32 of 60 is alive on atarnode44.atar.senecac.on.ca
> >>> Mesh 20 of 60 is alive on atarnode49.atar.senecac.on.ca
> >>> Mesh 35 of 60 is alive on atarnode42.atar.senecac.on.ca
> >>> Mesh 53 of 60 is alive on atarnode36.atar.senecac.on.ca
> >>> Mesh 22 of 60 is alive on atarnode49.atar.senecac.on.ca
> >>> Mesh 19 of 60 is alive on atarnode50.atar.senecac.on.ca
> >>> Mesh 48 of 60 is alive on atarnode38.atar.senecac.on.ca
> >>> Mesh 37 of 60 is alive on atarnode42.atar.senecac.on.ca
> >>> Mesh 54 of 60 is alive on atarnode36.atar.senecac.on.ca
> >>> Mesh 55 of 60 is alive on atarnode36.atar.senecac.on.ca
> >>> Mesh 45 of 60 is alive on atarnode39.atar.senecac.on.ca
> >>> Mesh 29 of 60 is alive on atarnode45.atar.senecac.on.ca
> >>> Mesh 24 of 60 is alive on atarnode48.atar.senecac.on.ca
> >>> Mesh 30 of 60 is alive on atarnode45.atar.senecac.on.ca
> >>> Mesh 31 of 60 is alive on atarnode45.atar.senecac.on.ca
> >>> Mesh 52 of 60 is alive on atarnode37.atar.senecac.on.ca
> >>> Mesh 28 of 60 is alive on atarnode47.atar.senecac.on.ca
> >>> Mesh 36 of 60 is alive on atarnode42.atar.senecac.on.ca
> >>> Mesh 34 of 60 is alive on atarnode44.atar.senecac.on.ca
> >>> Mesh 38 of 60 is alive on atarnode41.atar.senecac.on.ca
> >>> Mesh 40 of 60 is alive on atarnode41.atar.senecac.on.ca
> >>> Mesh  5 of 60 is alive on atarnode56.atar.senecac.on.ca
> >>> Mesh 57 of 60 is alive on atarnode35.atar.senecac.on.ca
> >>> Mesh 13 of 60 is alive on atarnode52.atar.senecac.on.ca
> >>> Mesh  9 of 60 is alive on atarnode54.atar.senecac.on.ca
> >>> Mesh 39 of 60 is alive on atarnode41.atar.senecac.on.ca
> >>> Mesh  7 of 60 is alive on atarnode55.atar.senecac.on.ca
> >>> Mesh 10 of 60 is alive on atarnode54.atar.senecac.on.ca
> >>> Mesh  8 of 60 is alive on atarnode55.atar.senecac.on.ca
> >>> Mesh  4 of 60 is alive on atarnode57.atar.senecac.on.ca
> >>> Mesh  6 of 60 is alive on atarnode56.atar.senecac.on.ca
> >>> Mesh 11 of 60 is alive on atarnode53.atar.senecac.on.ca
> >>> Mesh 14 of 60 is alive on atarnode52.atar.senecac.on.ca
> >>> Mesh 12 of 60 is alive on atarnode53.atar.senecac.on.ca
> >>> Mesh 21 of 60 is alive on atarnode49.atar.senecac.on.ca
> >>> Mesh 16 of 60 is alive on atarnode51.atar.senecac.on.ca
> >>> Mesh 33 of 60 is alive on atarnode44.atar.senecac.on.ca
> >>> Mesh 49 of 60 is alive on atarnode38.atar.senecac.on.ca
> >>> Mesh 25 of 60 is alive on atarnode48.atar.senecac.on.ca
> >>> Mesh 27 of 60 is alive on atarnode47.atar.senecac.on.ca
> >>> Mesh  2 of 60 is alive on atarnode59.atar.senecac.on.ca
> >>>
> >>> ERROR: Number of meshes not equal to number of threads
> >>>
> >>> [the ERROR line above is repeated 60 times in the original log]
> >>>
> >>> LAM 7.1.2/MPI 2 C++/ROMIO - Indiana University
> >>>
> >>>
> >>> Thanks
> >>>
> >>> Nilesh Mistry
> >>> Academic Computing Services
> >>> [EMAIL PROTECTED] & TEL Campus
> >>> Seneca College of Applied Arts & Technology
> >>> 70 The Pond Road
> >>> Toronto, Ontario
> >>> M3J 3M6 Canada
> >>> Phone 416 491 5050 ext 3788
> >>> Fax 416 661 4695
> >>> http://acs.senecac.on.ca
> >>>
> >>>
> >>>
> >>> Michael Edwards wrote:
> >>>
> >>>
> >>>> What do you get when you do "qstat -f" on the job?  How many nodes is
> >>>> it actually getting?
> >>>>
> >>>> On 9/18/07, Nilesh Mistry <[EMAIL PROTECTED]> wrote:
> >>>>
> >>>>
> >>>>
> >>>>> Michael
> >>>>>
> >>>>> We have actually moved to a larger cluster of 64 nodes (50 quad-core and
> >>>>> 14 dual Opterons), therefore 228 processors are available.  We are
> >>>>> submitting a job that requires 64 threads but still get the same
> >>>>> result.  Here are the files you requested.  I have already posted this to
> >>>>> the Torque users list.
> >>>>>
> >>>>> ####### PBS SCRIPT START #######
> >>>>>
> >>>>> #!/bin/sh -f
> >>>>> #PBS -l nodes=64
> >>>>> #PBS -N scaling_test
> >>>>> #PBS -e scaling_test.err
> >>>>> #PBS -o scaling_test.log
> >>>>> #PBS -j oe
> >>>>> #PBS -l mem=64000mb
> >>>>> #PBS -m abe
> >>>>> #PBS -q parallel
> >>>>>
> >>>>> NCPU=`wc -l < $PBS_NODEFILE`
> >>>>> echo ------------------------------------------------------
> >>>>> echo ' This job is allocated on '${NCPU}' cpu(s)'
> >>>>> echo 'Job is running on node(s): '
> >>>>> cat $PBS_NODEFILE
> >>>>> echo PBS: qsub is running on $PBS_O_HOST
> >>>>> echo PBS: originating queue is $PBS_O_QUEUE
> >>>>> echo PBS: executing queue is $PBS_QUEUE
> >>>>> echo PBS: working directory is $PBS_O_WORKDIR
> >>>>> echo PBS: execution mode is $PBS_ENVIRONMENT
> >>>>> echo PBS: job identifier is $PBS_JOBID
> >>>>> echo PBS: job name is $PBS_JOBNAME
> >>>>> echo PBS: node file is $PBS_NODEFILE
> >>>>> echo PBS: current home directory is $PBS_O_HOME
> >>>>> echo PBS: PATH = $PBS_O_PATH
> >>>>> echo ------------------------------------------------------
> >>>>> SERVER=$PBS_O_HOST
> >>>>> WORKDIR=$HOME/pbs/multi/scaling_test
> >>>>> cd ${WORKDIR}
> >>>>> cat $PBS_NODEFILE > nodes.list
> >>>>> lamboot -s -H $PBS_NODEFILE
> >>>>> mpirun -np $NCPU /opt/fds/fds5_mpi scaling_test.fds
> >>>>> lamhalt
> >>>>>
> >>>>> ####### PBS SCRIPT END #######
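
If I remember the FDS MPI build correctly, fds5_mpi aborts with "Number of
meshes not equal to number of threads" whenever the number of MPI processes
does not match the number of meshes in the input file, and the script above
starts exactly as many processes as there are lines in $PBS_NODEFILE.  When
the scheduler hands back 60 slots instead of 64, the two counts no longer
match.  A small guard in the script would at least fail early with a clear
message (sketch only; NMESH=64 is assumed from the 64-mesh case):

  NMESH=64                       # number of meshes in scaling_test.fds (assumed)
  NCPU=`wc -l < $PBS_NODEFILE`   # MPI slots PBS actually handed us
  if [ "$NCPU" -ne "$NMESH" ]; then
      echo "Got $NCPU slots but the input has $NMESH meshes; aborting" >&2
      exit 1
  fi
  mpirun -np $NCPU /opt/fds/fds5_mpi scaling_test.fds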
> >>>>>
> >>>>> ####### MAUI.CFG START #######
> >>>>> # maui.cfg 3.2.6p14
> >>>>>
> >>>>> SERVERHOST              master.atar.senecac.on.ca
> >>>>> # primary admin must be first in list
> >>>>> ADMIN1                  root
> >>>>> ADMIN3                  nilesh.mistry
> >>>>>
> >>>>>
> >>>>> # Resource Manager Definition
> >>>>>
> >>>>> RMCFG[master.atar.senecac.on.ca] TYPE=PBS
> >>>>>
> >>>>> # Allocation Manager Definition
> >>>>>
> >>>>> AMCFG[bank]  TYPE=NONE
> >>>>>
> >>>>> # full parameter docs at
> >>>>> http://clusterresources.com/mauidocs/a.fparameters.html
> >>>>> # use the 'schedctl -l' command to display current configuration
> >>>>>
> >>>>> RMPOLLINTERVAL  00:01:00
> >>>>>
> >>>>> SERVERPORT            42559
> >>>>> SERVERMODE            NORMAL
> >>>>>
> >>>>> # Admin: http://clusterresources.com/mauidocs/a.esecurity.html
> >>>>>
> >>>>>
> >>>>> LOGFILE               maui.log
> >>>>> LOGFILEMAXSIZE        10000000
> >>>>> LOGLEVEL              4
> >>>>> LOGFACILITY             fALL
> >>>>>
> >>>>> # Job Priority:
> >>>>> http://clusterresources.com/mauidocs/5.1jobprioritization.html
> >>>>>
> >>>>> QUEUETIMEWEIGHT       1
> >>>>>
> >>>>> # FairShare: http://clusterresources.com/mauidocs/6.3fairshare.html
> >>>>>
> >>>>> #FSPOLICY              PSDEDICATED
> >>>>> #FSDEPTH               7
> >>>>> #FSINTERVAL            86400
> >>>>> #FSDECAY               0.80
> >>>>>
> >>>>> # Throttling Policies:
> >>>>> http://clusterresources.com/mauidocs/6.2throttlingpolicies.html
> >>>>>
> >>>>> # NONE SPECIFIED
> >>>>>
> >>>>> # Backfill: http://clusterresources.com/mauidocs/8.2backfill.html
> >>>>>
> >>>>> BACKFILLPOLICY  ON
> >>>>> RESERVATIONPOLICY     CURRENTHIGHEST
> >>>>>
> >>>>> # the following are modified/added by Mehrdad 13 Sept 07
> >>>>> #NODEACCESSPOLICY       DEDICATED
> >>>>> NODEACCESSPOLICY        SHARED
> >>>>> JOBNODEMATCHPOLICY   EXACTPROC
> >>>>>
> >>>>> # Node Allocation:
> >>>>> http://clusterresources.com/mauidocs/5.2nodeallocation.html
> >>>>>
> >>>>> NODEALLOCATIONPOLICY  MINRESOURCE
> >>>>>
> >>>>> # QOS: http://clusterresources.com/mauidocs/7.3qos.html
> >>>>>
> >>>>> # QOSCFG[hi]  PRIORITY=100 XFTARGET=100 FLAGS=PREEMPTOR:IGNMAXJOB
> >>>>> # QOSCFG[low] PRIORITY=-1000 FLAGS=PREEMPTEE
> >>>>>
> >>>>> # Standing Reservations:
> >>>>> http://clusterresources.com/mauidocs/7.1.3standingreservations.html
> >>>>>
> >>>>> # SRSTARTTIME[test] 8:00:00
> >>>>> # SRENDTIME[test]   17:00:00
> >>>>> # SRDAYS[test]      MON TUE WED THU FRI
> >>>>> # SRTASKCOUNT[test] 20
> >>>>> # SRMAXTIME[test]   0:30:00
> >>>>>
> >>>>> # Creds: http://clusterresources.com/mauidocs/6.1fairnessoverview.html
> >>>>>
> >>>>> # USERCFG[DEFAULT]      FSTARGET=25.0
> >>>>> # USERCFG[john]         PRIORITY=100  FSTARGET=10.0-
> >>>>> # GROUPCFG[staff]       PRIORITY=1000 QLIST=hi:low QDEF=hi
> >>>>> # CLASSCFG[batch]       FLAGS=PREEMPTEE
> >>>>> # CLASSCFG[interactive] FLAGS=PREEMPTOR
> >>>>> USERCFG[DEFAULT]        MAXJOB=4
> >>>>> ####### MAUI.CFG  END #######
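
Before touching the Maui policies it may be worth checking what the scheduler
thinks each node actually offers; a few standard Maui commands (run as a Maui
admin; the node name is just an example taken from the logs above):

  showq                   # queue state as Maui sees it
  diagnose -n             # per-node view of configured vs. available processors
  checkjob 924            # why a given job got the allocation it did
  checknode atarnode50    # detailed view of a single node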
> >>>>>
> >>>>> ####### QMGR -c "PRINT SERVER MASTER" ########
> >>>>> #
> >>>>> # Create queues and set their attributes.
> >>>>> #
> >>>>> #
> >>>>> # Create and define queue serial
> >>>>> #
> >>>>> create queue serial
> >>>>> set queue serial queue_type = Execution
> >>>>> set queue serial resources_max.cput = 1000:00:00
> >>>>> set queue serial resources_max.mem = 3000mb
> >>>>> set queue serial resources_max.ncpus = 1
> >>>>> set queue serial resources_max.nodect = 1
> >>>>> set queue serial resources_max.nodes = 1:ppn=1
> >>>>> set queue serial resources_max.walltime = 1000:00:00
> >>>>> set queue serial resources_default.cput = 336:00:00
> >>>>> set queue serial resources_default.mem = 900mb
> >>>>> set queue serial resources_default.ncpus = 1
> >>>>> set queue serial resources_default.nodect = 1
> >>>>> set queue serial resources_default.nodes = 1:ppn=1
> >>>>> set queue serial enabled = True
> >>>>> set queue serial started = True
> >>>>> #
> >>>>> # Create and define queue workq
> >>>>> #
> >>>>> create queue workq
> >>>>> set queue workq queue_type = Execution
> >>>>> set queue workq resources_max.cput = 10000:00:00
> >>>>> set queue workq resources_max.ncpus = 200
> >>>>> set queue workq resources_max.nodect = 64
> >>>>> set queue workq resources_max.nodes = 200:ppn=4
> >>>>> set queue workq resources_max.walltime = 10000:00:00
> >>>>> set queue workq resources_min.cput = 00:00:01
> >>>>> set queue workq resources_min.ncpus = 1
> >>>>> set queue workq resources_min.nodect = 1
> >>>>> set queue workq resources_min.walltime = 00:00:01
> >>>>> set queue workq resources_default.cput = 10000:00:00
> >>>>> set queue workq resources_default.nodect = 1
> >>>>> set queue workq resources_default.walltime = 10000:00:00
> >>>>> set queue workq enabled = True
> >>>>> set queue workq started = True
> >>>>> #
> >>>>> # Create and define queue parallel
> >>>>> #
> >>>>> create queue parallel
> >>>>> set queue parallel queue_type = Execution
> >>>>> set queue parallel resources_max.cput = 10000:00:00
> >>>>> set queue parallel resources_max.ncpus = 200
> >>>>> set queue parallel resources_max.nodect = 64
> >>>>> set queue parallel resources_max.nodes = 200:ppn=4
> >>>>> set queue parallel resources_max.walltime = 10000:00:00
> >>>>> set queue parallel resources_min.ncpus = 1
> >>>>> set queue parallel resources_min.nodect = 1
> >>>>> set queue parallel resources_default.ncpus = 1
> >>>>> set queue parallel resources_default.nodect = 1
> >>>>> set queue parallel resources_default.nodes = 1:ppn=1
> >>>>> set queue parallel resources_default.walltime = 10000:00:00
> >>>>> set queue parallel enabled = True
> >>>>> set queue parallel started = True
> >>>>> #
> >>>>> # Set server attributes.
> >>>>> #
> >>>>> set server scheduling = True
> >>>>> set server acl_host_enable = False
> >>>>> set server acl_user_enable = False
> >>>>> set server default_queue = serial
> >>>>> set server log_events = 127
> >>>>> set server mail_from = adm
> >>>>> set server query_other_jobs = True
> >>>>> set server resources_available.ncpus = 200
> >>>>> set server resources_available.nodect = 64
> >>>>> set server resources_available.nodes = 200
> >>>>> set server resources_default.neednodes = 1
> >>>>> set server resources_default.nodect = 1
> >>>>> set server resources_default.nodes = 1
> >>>>> set server resources_max.ncpus = 200
> >>>>> set server resources_max.nodes = 200
> >>>>> set server scheduler_iteration = 60
> >>>>> set server node_check_rate = 150
> >>>>> set server tcp_timeout = 6
> >>>>> set server default_node = 1
> >>>>> set server pbs_version = 2.0.0p8
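
Following the 50*4 + 14*2 = 228 arithmetic above, and noting that the job is
actually routed through the parallel queue rather than workq, the limits
there could be brought in line with the hardware.  A sketch only; whether the
nodes= limit should use the node count (64) or follow the core count is worth
checking against the Torque docs:

  qmgr -c "set queue parallel resources_max.ncpus = 228"
  qmgr -c "set queue parallel resources_max.nodect = 64"
  qmgr -c "set queue parallel resources_max.nodes = 64:ppn=4"
  qmgr -c "set server resources_available.ncpus = 228"
  qmgr -c "set server resources_max.ncpus = 228"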
> >>>>>
> >>>>>
> >>>>>
> >>>>> Thanks
> >>>>>
> >>>>> Nilesh Mistry
> >>>>> Academic Computing Services
> >>>>> [EMAIL PROTECTED] & TEL Campus
> >>>>> Seneca College of Applied Arts & Technology
> >>>>> 70 The Pond Road
> >>>>> Toronto, Ontario
> >>>>> M3J 3M6 Canada
> >>>>> Phone 416 491 5050 ext 3788
> >>>>> Fax 416 661 4695
> >>>>> http://acs.senecac.on.ca
> >>>>>
> >>>>>
> >>>>>
> >>>>> Michael Edwards wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>>> We'd need your script and the qsub command you used, and possibly more
> >>>>>> configuration information from Maui and Torque, to be of much help.
> >>>>>>
> >>>>>> I don't know that we have anyone who is deep into Maui or Torque right
> >>>>>> now; you might also want to ask on the Maui or Torque lists.
> >>>>>>
> >>>>>> From the other posts you have made, this error seems to be one of those
> >>>>>> general "Something is Broken" messages that could have many causes.
> >>>>>>
> >>>>>> On 9/17/07, Nilesh Mistry <[EMAIL PROTECTED]> wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> Hello
> >>>>>>>
> >>>>>>> I am having problems submitting a job that requires 23 threads.  I keep
> >>>>>>> getting the following error:
> >>>>>>>
> >>>>>>> ERROR: Number of meshes not equal to number of threads
> >>>>>>>
> >>>>>>> Hardware:
> >>>>>>> 10 quad-core nodes (therefore 40 processors available)
> >>>>>>>
> >>>>>>> What do I need to ensure in my job queue (qmgr), maui (maui.cfg) and
> >>>>>>> my submit script when using qsub?
> >>>>>>>
> >>>>>>> Any and all help is greatly appreciated.
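
For the original 23-process case on the 10 quad-core nodes, the request has
to add up to exactly 23 MPI processes if the FDS input has 23 meshes.  One
possible way to write that in the submit script (a sketch only; Torque
accepts '+'-joined node specs, and the input file name is a placeholder):

  #PBS -l nodes=5:ppn=4+1:ppn=3          # 5*4 + 3 = 23 processor slots
  mpirun -np 23 /opt/fds/fds5_mpi job.fds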
> >>>>>>>
> >>>>>>> --
> >>>>>>> Thanks
> >>>>>>>
> >>>>>>> Nilesh Mistry
> >>>>>>> Academic Computing Services
> >>>>>>> [EMAIL PROTECTED] & TEL Campus
> >>>>>>> Seneca College of Applied Arts & Technology
> >>>>>>> 70 The Pond Road
> >>>>>>> Toronto, Ontario
> >>>>>>> M3J 3M6 Canada
> >>>>>>> Phone 416 491 5050 ext 3788
> >>>>>>> Fax 416 661 4695
> >>>>>>> http://acs.senecac.on.ca

_______________________________________________
Oscar-users mailing list
Oscar-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/oscar-users
