I did some further research and found that resources_assigned.ncpus and resources_assigned.nodect are live statistics of the resources currently in use by running jobs, which is why these values cannot be set by hand. When I submitted jobs that I know run, the values increased as resources were used.
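In case it is useful to anyone else, something like the following should show those counters without trying to set them (just the standard Torque/PBS client commands; the grep only filters the attribute lines, and "parallel" is the queue our jobs go to):

qmgr -c "list server" | grep resources_assigned    # server-wide counters, updated as jobs start and finish
qstat -Q -f parallel | grep resources_assigned     # the same counters for a single queue
qstat -B -f | grep resources_assigned              # full server status should list them as well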
Thanks Nilesh Mistry Academic Computing Services [EMAIL PROTECTED] & TEL Campus Seneca College Of Applies Arts & Technology 70 The Pond Road Toronto, Ontario M3J 3M6 Canada Phone 416 491 5050 ext 3788 Fax 416 661 4695 http://acs.senecac.on.ca Michael Edwards wrote: > Have you tried setting them in the config files themselves? > > On 9/18/07, Nilesh Mistry <[EMAIL PROTECTED]> wrote: > >> Hello >> >> I was checking the pbs server setting (qmgr -c "list server") and notice >> the following value: >> >> resources_assigned.ncpus = 2 >> resources_assigned.nodect = 24 >> >> Could these 2 values be my problem? I have been trying to changes these >> as root but getting: >> >> qmgr obj=master svr=master: Cannot set attribute, read only or >> insufficient permission resources_assigned.nodect* >> >> >> * >> >> Thanks >> >> Nilesh Mistry >> Academic Computing Services >> [EMAIL PROTECTED] & TEL Campus >> Seneca College Of Applies Arts & Technology >> 70 The Pond Road >> Toronto, Ontario >> M3J 3M6 Canada >> Phone 416 491 5050 ext 3788 >> Fax 416 661 4695 >> http://acs.senecac.on.ca >> >> >> >> Nilesh Mistry wrote: >> >>> Hello Michael >>> >>> I have tried this as well with the same result. >>> >>> I have even set properties for the quad core nodes and specified them in >>> the pbs script to only select those nodes, but still the same problem. >>> >>> I have created a new queue adding the properties of the quad core nodes >>> to resources_max.nodes = 228:ppn=4:i965, however still the same result. >>> >>> Thanks >>> >>> Nilesh Mistry >>> Academic Computing Services >>> [EMAIL PROTECTED] & TEL Campus >>> Seneca College Of Applies Arts & Technology >>> 70 The Pond Road >>> Toronto, Ontario >>> M3J 3M6 Canada >>> Phone 416 491 5050 ext 3788 >>> Fax 416 661 4695 >>> http://acs.senecac.on.ca >>> >>> >>> >>> Michael Edwards wrote: >>> >>> >>>> set queue workq resources_max.ncpus = 200 >>>> set queue workq resources_max.nodect = 64 >>>> set queue workq resources_max.nodes = 200:ppn=4 >>>> >>>> This should probably be 50*4 + 14*2 = 228 >>>> >>>> set queue workq resources_max.ncpus = 228 >>>> set queue workq resources_max.nodect = 64 >>>> set queue workq resources_max.nodes = 228:ppn=4 >>>> >>>> Though you might want to try making two queues, I don't know how well >>>> torque deals with having different numbers of ppn on different nodes. >>>> >>>> On 9/18/07, Nilesh Mistry <[EMAIL PROTECTED]> wrote: >>>> >>>> >>>> >>>>> I the following error after the using qsub >>>>> >>>>> qsub: Job exceeds queue resource limits >>>>> >>>>> If I change #PBS -l nodes=64 to #PBS -l nodes=60 i get the job submitted >>>>> and running and the if fails >>>>> >>>>> ################ qstat -f ############################ >>>>> >>>>> Job Id: 924.master.atar.senecac.on.ca >>>>> Job_Name = scaling_test >>>>> Job_Owner = [EMAIL PROTECTED] >>>>> job_state = R >>>>> queue = parallel >>>>> server = master.atar.senecac.on.ca >>>>> Checkpoint = u >>>>> ctime = Tue Sep 18 09:09:45 2007 >>>>> Error_Path = >>>>> master:/home/faculty/nilesh.mistry/pbs/multi/scaling_test/scal >>>>> ing_test.err >>>>> exec_host = >>>>> atarnode59.atar.senecac.on.ca/1+atarnode59.atar.senecac.on.ca/0 >>>>> >>>>> +atarnode57.atar.senecac.on.ca/1+atarnode57.atar.senecac.on.ca/0+atarno >>>>> >>>>> de56.atar.senecac.on.ca/1+atarnode56.atar.senecac.on.ca/0+atarnode55.at >>>>> >>>>> ar.senecac.on.ca/1+atarnode55.atar.senecac.on.ca/0+atarnode54.atar.sene >>>>> >>>>> cac.on.ca/1+atarnode54.atar.senecac.on.ca/0+atarnode53.atar.senecac.on. 
>>>>> >>>>> ca/1+atarnode53.atar.senecac.on.ca/0+atarnode52.atar.senecac.on.ca/1+at >>>>> >>>>> arnode52.atar.senecac.on.ca/0+atarnode51.atar.senecac.on.ca/1+atarnode5 >>>>> >>>>> 1.atar.senecac.on.ca/0+atarnode50.atar.senecac.on.ca/2+atarnode50.atar. >>>>> >>>>> senecac.on.ca/1+atarnode50.atar.senecac.on.ca/0+atarnode49.atar.senecac >>>>> >>>>> .on.ca/2+atarnode49.atar.senecac.on.ca/1+atarnode49.atar.senecac.on.ca/ >>>>> >>>>> 0+atarnode48.atar.senecac.on.ca/2+atarnode48.atar.senecac.on.ca/1+atarn >>>>> >>>>> ode48.atar.senecac.on.ca/0+atarnode47.atar.senecac.on.ca/2+atarnode47.a >>>>> >>>>> tar.senecac.on.ca/1+atarnode47.atar.senecac.on.ca/0+atarnode45.atar.sen >>>>> >>>>> ecac.on.ca/2+atarnode45.atar.senecac.on.ca/1+atarnode45.atar.senecac.on >>>>> >>>>> .ca/0+atarnode44.atar.senecac.on.ca/2+atarnode44.atar.senecac.on.ca/1+a >>>>> >>>>> tarnode44.atar.senecac.on.ca/0+atarnode42.atar.senecac.on.ca/2+atarnode >>>>> >>>>> 42.atar.senecac.on.ca/1+atarnode42.atar.senecac.on.ca/0+atarnode41.atar >>>>> >>>>> .senecac.on.ca/2+atarnode41.atar.senecac.on.ca/1+atarnode41.atar.seneca >>>>> >>>>> c.on.ca/0+atarnode40.atar.senecac.on.ca/2+atarnode40.atar.senecac.on.ca >>>>> >>>>> /1+atarnode40.atar.senecac.on.ca/0+atarnode39.atar.senecac.on.ca/2+atar >>>>> >>>>> node39.atar.senecac.on.ca/1+atarnode39.atar.senecac.on.ca/0+atarnode38. >>>>> >>>>> atar.senecac.on.ca/2+atarnode38.atar.senecac.on.ca/1+atarnode38.atar.se >>>>> >>>>> necac.on.ca/0+atarnode37.atar.senecac.on.ca/2+atarnode37.atar.senecac.o >>>>> >>>>> n.ca/1+atarnode37.atar.senecac.on.ca/0+atarnode36.atar.senecac.on.ca/2+ >>>>> >>>>> atarnode36.atar.senecac.on.ca/1+atarnode36.atar.senecac.on.ca/0+atarnod >>>>> >>>>> e35.atar.senecac.on.ca/2+atarnode35.atar.senecac.on.ca/1+atarnode35.ata >>>>> >>>>> r.senecac.on.ca/0+atarnode34.atar.senecac.on.ca/1+atarnode34.atar.senec >>>>> ac.on.ca/0 >>>>> Hold_Types = n >>>>> Join_Path = oe >>>>> Keep_Files = n >>>>> Mail_Points = abe >>>>> Mail_Users = nilesh.mistry >>>>> mtime = Tue Sep 18 09:09:46 2007 >>>>> Output_Path = >>>>> master:/home/faculty/nilesh.mistry/pbs/multi/scaling_test/sca >>>>> ling_test.log >>>>> Priority = 0 >>>>> qtime = Tue Sep 18 09:09:45 2007 >>>>> Rerunable = True >>>>> Resource_List.cput = 10000:00:00 >>>>> Resource_List.mem = 64000mb >>>>> Resource_List.ncpus = 1 >>>>> Resource_List.nodect = 60 >>>>> Resource_List.nodes = 60 >>>>> Resource_List.walltime = 10000:00:00 >>>>> Variable_List = PBS_O_HOME=/home/faculty/nilesh.mistry, >>>>> PBS_O_LANG=en_CA.UTF-8,PBS_O_LOGNAME=nilesh.mistry, >>>>> >>>>> PBS_O_PATH=/usr/kerberos/bin:/opt/lam-7.1.2/bin:/usr/local/bin:/bin:/u >>>>> >>>>> sr/bin:/usr/X11R6/bin:/opt/env-switcher/bin:/opt/pvm3/lib:/opt/pvm3/lib >>>>> >>>>> /LINUX:/opt/pvm3/bin/LINUX:/opt/pbs/bin:/opt/pbs/lib/xpbs/bin:/opt/c3-4 >>>>> >>>>> /:/home/faculty/nilesh.mistry/bin:/opt/maui/bin:/usr/lib/news/bin:/home >>>>> /faculty/nilesh.mistry/scripts, >>>>> PBS_O_MAIL=/var/spool/mail/nilesh.mistry,PBS_O_SHELL=/bin/bash, >>>>> PBS_O_HOST=master.atar.senecac.on.ca, >>>>> PBS_O_WORKDIR=/home/faculty/nilesh.mistry/pbs/multi/scaling_test, >>>>> PBS_O_QUEUE=parallel >>>>> etime = Tue Sep 18 09:09:45 2007 >>>>> >>>>> ###################### Log file ############################## >>>>> >>>>> ------------------------------------------------------ >>>>> This job is allocated on 60 cpu(s) >>>>> Job is running on node(s): >>>>> atarnode59.atar.senecac.on.ca >>>>> atarnode59.atar.senecac.on.ca >>>>> atarnode57.atar.senecac.on.ca >>>>> atarnode57.atar.senecac.on.ca >>>>> 
atarnode56.atar.senecac.on.ca >>>>> atarnode56.atar.senecac.on.ca >>>>> atarnode55.atar.senecac.on.ca >>>>> atarnode55.atar.senecac.on.ca >>>>> atarnode54.atar.senecac.on.ca >>>>> atarnode54.atar.senecac.on.ca >>>>> atarnode53.atar.senecac.on.ca >>>>> atarnode53.atar.senecac.on.ca >>>>> atarnode52.atar.senecac.on.ca >>>>> atarnode52.atar.senecac.on.ca >>>>> atarnode51.atar.senecac.on.ca >>>>> atarnode51.atar.senecac.on.ca >>>>> atarnode50.atar.senecac.on.ca >>>>> atarnode50.atar.senecac.on.ca >>>>> atarnode50.atar.senecac.on.ca >>>>> atarnode49.atar.senecac.on.ca >>>>> atarnode49.atar.senecac.on.ca >>>>> atarnode49.atar.senecac.on.ca >>>>> atarnode48.atar.senecac.on.ca >>>>> atarnode48.atar.senecac.on.ca >>>>> atarnode48.atar.senecac.on.ca >>>>> atarnode47.atar.senecac.on.ca >>>>> atarnode47.atar.senecac.on.ca >>>>> atarnode47.atar.senecac.on.ca >>>>> atarnode45.atar.senecac.on.ca >>>>> atarnode45.atar.senecac.on.ca >>>>> atarnode45.atar.senecac.on.ca >>>>> atarnode44.atar.senecac.on.ca >>>>> atarnode44.atar.senecac.on.ca >>>>> atarnode44.atar.senecac.on.ca >>>>> atarnode42.atar.senecac.on.ca >>>>> atarnode42.atar.senecac.on.ca >>>>> atarnode42.atar.senecac.on.ca >>>>> atarnode41.atar.senecac.on.ca >>>>> atarnode41.atar.senecac.on.ca >>>>> atarnode41.atar.senecac.on.ca >>>>> atarnode40.atar.senecac.on.ca >>>>> atarnode40.atar.senecac.on.ca >>>>> atarnode40.atar.senecac.on.ca >>>>> atarnode39.atar.senecac.on.ca >>>>> atarnode39.atar.senecac.on.ca >>>>> atarnode39.atar.senecac.on.ca >>>>> atarnode38.atar.senecac.on.ca >>>>> atarnode38.atar.senecac.on.ca >>>>> atarnode38.atar.senecac.on.ca >>>>> atarnode37.atar.senecac.on.ca >>>>> atarnode37.atar.senecac.on.ca >>>>> atarnode37.atar.senecac.on.ca >>>>> atarnode36.atar.senecac.on.ca >>>>> atarnode36.atar.senecac.on.ca >>>>> atarnode36.atar.senecac.on.ca >>>>> atarnode35.atar.senecac.on.ca >>>>> atarnode35.atar.senecac.on.ca >>>>> atarnode35.atar.senecac.on.ca >>>>> atarnode34.atar.senecac.on.ca >>>>> atarnode34.atar.senecac.on.ca >>>>> PBS: qsub is running on master.atar.senecac.on.ca >>>>> PBS: originating queue is parallel >>>>> PBS: executing queue is parallel >>>>> PBS: working directory is >>>>> /home/faculty/nilesh.mistry/pbs/multi/scaling_test >>>>> PBS: execution mode is PBS_BATCH >>>>> PBS: job identifier is 924.master.atar.senecac.on.ca >>>>> PBS: job name is scaling_test >>>>> PBS: node file is /var/spool/pbs/aux//924.master.atar.senecac.on.ca >>>>> PBS: current home directory is /home/faculty/nilesh.mistry >>>>> PBS: PATH = >>>>> /usr/kerberos/bin:/opt/lam-7.1.2/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/opt/env-switcher/bin:/opt/pvm3/lib:/opt/pvm3/lib/LINUX:/opt/pvm3/bi >>>>> n/LINUX:/opt/pbs/bin:/opt/pbs/lib/xpbs/bin:/opt/c3-4/:/home/faculty/nilesh.mistry/bin:/opt/maui/bin:/usr/lib/news/bin:/home/faculty/nilesh.mistry/scripts >>>>> ------------------------------------------------------ >>>>> Mesh 1 of 60 is alive on atarnode59.atar.senecac.on.ca >>>>> Mesh 17 of 60 is alive on atarnode50.atar.senecac.on.ca >>>>> Mesh 18 of 60 is alive on atarnode50.atar.senecac.on.ca >>>>> Mesh 3 of 60 is alive on atarnode57.atar.senecac.on.ca >>>>> Mesh 50 of 60 is alive on atarnode37.atar.senecac.on.ca >>>>> Mesh 51 of 60 is alive on atarnode37.atar.senecac.on.ca >>>>> Mesh 58 of 60 is alive on atarnode35.atar.senecac.on.ca >>>>> Mesh 15 of 60 is alive on atarnode51.atar.senecac.on.ca >>>>> Mesh 56 of 60 is alive on atarnode35.atar.senecac.on.ca >>>>> Mesh 47 of 60 is alive on atarnode38.atar.senecac.on.ca >>>>> Mesh 41 of 60 is alive on 
atarnode40.atar.senecac.on.ca >>>>> Mesh 43 of 60 is alive on atarnode40.atar.senecac.on.ca >>>>> Mesh 23 of 60 is alive on atarnode48.atar.senecac.on.ca >>>>> Mesh 59 of 60 is alive on atarnode34.atar.senecac.on.ca >>>>> Mesh 44 of 60 is alive on atarnode39.atar.senecac.on.ca >>>>> Mesh 60 of 60 is alive on atarnode34.atar.senecac.on.ca >>>>> Mesh 26 of 60 is alive on atarnode47.atar.senecac.on.ca >>>>> Mesh 46 of 60 is alive on atarnode39.atar.senecac.on.ca >>>>> Mesh 42 of 60 is alive on atarnode40.atar.senecac.on.ca >>>>> Mesh 32 of 60 is alive on atarnode44.atar.senecac.on.ca >>>>> Mesh 20 of 60 is alive on atarnode49.atar.senecac.on.ca >>>>> Mesh 35 of 60 is alive on atarnode42.atar.senecac.on.ca >>>>> Mesh 53 of 60 is alive on atarnode36.atar.senecac.on.ca >>>>> Mesh 22 of 60 is alive on atarnode49.atar.senecac.on.ca >>>>> Mesh 19 of 60 is alive on atarnode50.atar.senecac.on.ca >>>>> Mesh 48 of 60 is alive on atarnode38.atar.senecac.on.ca >>>>> Mesh 37 of 60 is alive on atarnode42.atar.senecac.on.ca >>>>> Mesh 54 of 60 is alive on atarnode36.atar.senecac.on.ca >>>>> Mesh 55 of 60 is alive on atarnode36.atar.senecac.on.ca >>>>> Mesh 45 of 60 is alive on atarnode39.atar.senecac.on.ca >>>>> Mesh 29 of 60 is alive on atarnode45.atar.senecac.on.ca >>>>> Mesh 24 of 60 is alive on atarnode48.atar.senecac.on.ca >>>>> Mesh 30 of 60 is alive on atarnode45.atar.senecac.on.ca >>>>> Mesh 31 of 60 is alive on atarnode45.atar.senecac.on.ca >>>>> Mesh 52 of 60 is alive on atarnode37.atar.senecac.on.ca >>>>> Mesh 28 of 60 is alive on atarnode47.atar.senecac.on.ca >>>>> Mesh 36 of 60 is alive on atarnode42.atar.senecac.on.ca >>>>> Mesh 34 of 60 is alive on atarnode44.atar.senecac.on.ca >>>>> Mesh 38 of 60 is alive on atarnode41.atar.senecac.on.ca >>>>> Mesh 40 of 60 is alive on atarnode41.atar.senecac.on.ca >>>>> Mesh 5 of 60 is alive on atarnode56.atar.senecac.on.ca >>>>> Mesh 57 of 60 is alive on atarnode35.atar.senecac.on.ca >>>>> Mesh 13 of 60 is alive on atarnode52.atar.senecac.on.ca >>>>> Mesh 9 of 60 is alive on atarnode54.atar.senecac.on.ca >>>>> Mesh 39 of 60 is alive on atarnode41.atar.senecac.on.ca >>>>> Mesh 7 of 60 is alive on atarnode55.atar.senecac.on.ca >>>>> Mesh 10 of 60 is alive on atarnode54.atar.senecac.on.ca >>>>> Mesh 8 of 60 is alive on atarnode55.atar.senecac.on.ca >>>>> Mesh 4 of 60 is alive on atarnode57.atar.senecac.on.ca >>>>> Mesh 6 of 60 is alive on atarnode56.atar.senecac.on.ca >>>>> Mesh 11 of 60 is alive on atarnode53.atar.senecac.on.ca >>>>> Mesh 14 of 60 is alive on atarnode52.atar.senecac.on.ca >>>>> Mesh 12 of 60 is alive on atarnode53.atar.senecac.on.ca >>>>> Mesh 21 of 60 is alive on atarnode49.atar.senecac.on.ca >>>>> Mesh 16 of 60 is alive on atarnode51.atar.senecac.on.ca >>>>> Mesh 33 of 60 is alive on atarnode44.atar.senecac.on.ca >>>>> Mesh 49 of 60 is alive on atarnode38.atar.senecac.on.ca >>>>> Mesh 25 of 60 is alive on atarnode48.atar.senecac.on.ca >>>>> Mesh 27 of 60 is alive on atarnode47.atar.senecac.on.ca >>>>> Mesh 2 of 60 is alive on atarnode59.atar.senecac.on.ca >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: 
Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> ERROR: Number of meshes not equal to number of threads >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> ERROR: Number of meshes not equal to number of threads >>>>> ERROR: Number of meshes not equal to number of threads >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> ERROR: Number of meshes not equal to number of threads >>>>> >>>>> LAM 
7.1.2/MPI 2 C++/ROMIO - Indiana University >>>>> >>>>> >>>>> Thanks >>>>> >>>>> Nilesh Mistry >>>>> Academic Computing Services >>>>> [EMAIL PROTECTED] & TEL Campus >>>>> Seneca College Of Applies Arts & Technology >>>>> 70 The Pond Road >>>>> Toronto, Ontario >>>>> M3J 3M6 Canada >>>>> Phone 416 491 5050 ext 3788 >>>>> Fax 416 661 4695 >>>>> http://acs.senecac.on.ca >>>>> >>>>> >>>>> >>>>> Michael Edwards wrote: >>>>> >>>>> >>>>> >>>>>> What do you get when you do "qstat -f" on the job? How many nodes is >>>>>> it actually getting? >>>>>> >>>>>> On 9/18/07, Nilesh Mistry <[EMAIL PROTECTED]> wrote: >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> Micheal >>>>>>> >>>>>>> We have actually moved to a larger cluster of 64 nodes (50 quad core and >>>>>>> 14 dual opterons), there fore 220 processors available. We are >>>>>>> submitting a job that requires 64 threads but still with the same >>>>>>> result. Here are the files you requested. I have already submitted to >>>>>>> torque user list. >>>>>>> >>>>>>> ####### PBS SCRIPT START####### >>>>>>> >>>>>>> #!/bin/sh -f >>>>>>> #PBS -l nodes=64 >>>>>>> #PBS -N scaling_test >>>>>>> #PBS -e scaling_test.err >>>>>>> #PBS -o scaling_test.log >>>>>>> #PBS -j oe >>>>>>> #PBS -l mem=64000mb >>>>>>> #PBS -m abe >>>>>>> #PBS -q parallel >>>>>>> >>>>>>> NCPU=`wc -l < $PBS_NODEFILE` >>>>>>> echo ------------------------------------------------------ >>>>>>> echo ' This job is allocated on '${NCPU}' cpu(s)' >>>>>>> echo 'Job is running on node(s): ' >>>>>>> cat $PBS_NODEFILE >>>>>>> echo PBS: qsub is running on $PBS_O_HOST >>>>>>> echo PBS: originating queue is $PBS_O_QUEUE >>>>>>> echo PBS: executing queue is $PBS_QUEUE >>>>>>> echo PBS: working directory is $PBS_O_WORKDIR >>>>>>> echo PBS: execution mode is $PBS_ENVIRONMENT >>>>>>> echo PBS: job identifier is $PBS_JOBID >>>>>>> echo PBS: job name is $PBS_JOBNAME >>>>>>> echo PBS: node file is $PBS_NODEFILE >>>>>>> echo PBS: current home directory is $PBS_O_HOME >>>>>>> echo PBS: PATH = $PBS_O_PATH >>>>>>> echo ------------------------------------------------------ >>>>>>> SERVER=$PBS_O_HOST >>>>>>> WORKDIR=$HOME/pbs/multi/scaling_test >>>>>>> cd ${WORKDIR} >>>>>>> cat $PBS_NODEFILE > nodes.list >>>>>>> lamboot -s -H $PBS_NODEFILE >>>>>>> mpirun -np $NCPU /opt/fds/fds5_mpi scaling_test.fds >>>>>>> lamhalt >>>>>>> >>>>>>> ####### PBS SCRIPT END ####### >>>>>>> >>>>>>> ####### MAUI.CFG START ####### >>>>>>> # maui.cfg 3.2.6p14 >>>>>>> >>>>>>> SERVERHOST master.atar.senecac.on.ca >>>>>>> # primary admin must be first in list >>>>>>> ADMIN1 root >>>>>>> ADMIN3 nilesh.mistry >>>>>>> >>>>>>> >>>>>>> # Resource Manager Definition >>>>>>> >>>>>>> RMCFG[master.atar.senecac.on.ca] TYPE=PBS >>>>>>> >>>>>>> # Allocation Manager Definition >>>>>>> >>>>>>> AMCFG[bank] TYPE=NONE >>>>>>> >>>>>>> # full parameter docs at >>>>>>> http://clusterresources.com/mauidocs/a.fparameters.html >>>>>>> # use the 'schedctl -l' command to display current configuration >>>>>>> >>>>>>> RMPOLLINTERVAL 00:01:00 >>>>>>> >>>>>>> SERVERPORT 42559 >>>>>>> SERVERMODE NORMAL >>>>>>> >>>>>>> # Admin: http://clusterresources.com/mauidocs/a.esecurity.html >>>>>>> >>>>>>> >>>>>>> LOGFILE maui.log >>>>>>> LOGFILEMAXSIZE 10000000 >>>>>>> LOGLEVEL 4 >>>>>>> LOGFACILITY fALL >>>>>>> >>>>>>> # Job Priority: >>>>>>> http://clusterresources.com/mauidocs/5.1jobprioritization.html >>>>>>> >>>>>>> QUEUETIMEWEIGHT 1 >>>>>>> >>>>>>> # FairShare: http://clusterresources.com/mauidocs/6.3fairshare.html >>>>>>> >>>>>>> #FSPOLICY PSDEDICATED >>>>>>> #FSDEPTH 7 >>>>>>> #FSINTERVAL 
86400 >>>>>>> #FSDECAY 0.80 >>>>>>> >>>>>>> # Throttling Policies: >>>>>>> http://clusterresources.com/mauidocs/6.2throttlingpolicies.html >>>>>>> >>>>>>> # NONE SPECIFIED >>>>>>> >>>>>>> # Backfill: http://clusterresources.com/mauidocs/8.2backfill.html >>>>>>> >>>>>>> BACKFILLPOLICY ON >>>>>>> RESERVATIONPOLICY CURRENTHIGHEST >>>>>>> >>>>>>> # the following are modified/added by Mehrdad 13 Sept 07 >>>>>>> #NODEACCESSPOLICY DEDICATED >>>>>>> NODEACCESSPOLICY SHARED >>>>>>> JOBNODEMATCHPOLICY EXACTPROC >>>>>>> >>>>>>> # Node Allocation: >>>>>>> http://clusterresources.com/mauidocs/5.2nodeallocation.html >>>>>>> >>>>>>> NODEALLOCATIONPOLICY MINRESOURCE >>>>>>> >>>>>>> # QOS: http://clusterresources.com/mauidocs/7.3qos.html >>>>>>> >>>>>>> # QOSCFG[hi] PRIORITY=100 XFTARGET=100 FLAGS=PREEMPTOR:IGNMAXJOB >>>>>>> # QOSCFG[low] PRIORITY=-1000 FLAGS=PREEMPTEE >>>>>>> >>>>>>> # Standing Reservations: >>>>>>> http://clusterresources.com/mauidocs/7.1.3standingreservations.html >>>>>>> >>>>>>> # SRSTARTTIME[test] 8:00:00 >>>>>>> # SRENDTIME[test] 17:00:00 >>>>>>> # SRDAYS[test] MON TUE WED THU FRI >>>>>>> # SRTASKCOUNT[test] 20 >>>>>>> # SRMAXTIME[test] 0:30:00 >>>>>>> >>>>>>> # Creds: http://clusterresources.com/mauidocs/6.1fairnessoverview.html >>>>>>> >>>>>>> # USERCFG[DEFAULT] FSTARGET=25.0 >>>>>>> # USERCFG[john] PRIORITY=100 FSTARGET=10.0- >>>>>>> # GROUPCFG[staff] PRIORITY=1000 QLIST=hi:low QDEF=hi >>>>>>> # CLASSCFG[batch] FLAGS=PREEMPTEE >>>>>>> # CLASSCFG[interactive] FLAGS=PREEMPTOR >>>>>>> USERCFG[DEFAULT] MAXJOB=4 >>>>>>> ####### MAUI.CFG END ####### >>>>>>> >>>>>>> ####### QMGR -c "PRINT SERVER MASTER" ######## >>>>>>> # >>>>>>> # Create queues and set their attributes. >>>>>>> # >>>>>>> # >>>>>>> # Create and define queue serial >>>>>>> # >>>>>>> create queue serial >>>>>>> set queue serial queue_type = Execution >>>>>>> set queue serial resources_max.cput = 1000:00:00 >>>>>>> set queue serial resources_max.mem = 3000mb >>>>>>> set queue serial resources_max.ncpus = 1 >>>>>>> set queue serial resources_max.nodect = 1 >>>>>>> set queue serial resources_max.nodes = 1:ppn=1 >>>>>>> set queue serial resources_max.walltime = 1000:00:00 >>>>>>> set queue serial resources_default.cput = 336:00:00 >>>>>>> set queue serial resources_default.mem = 900mb >>>>>>> set queue serial resources_default.ncpus = 1 >>>>>>> set queue serial resources_default.nodect = 1 >>>>>>> set queue serial resources_default.nodes = 1:ppn=1 >>>>>>> set queue serial enabled = True >>>>>>> set queue serial started = True >>>>>>> # >>>>>>> # Create and define queue workq >>>>>>> # >>>>>>> create queue workq >>>>>>> set queue workq queue_type = Execution >>>>>>> set queue workq resources_max.cput = 10000:00:00 >>>>>>> set queue workq resources_max.ncpus = 200 >>>>>>> set queue workq resources_max.nodect = 64 >>>>>>> set queue workq resources_max.nodes = 200:ppn=4 >>>>>>> set queue workq resources_max.walltime = 10000:00:00 >>>>>>> set queue workq resources_min.cput = 00:00:01 >>>>>>> set queue workq resources_min.ncpus = 1 >>>>>>> set queue workq resources_min.nodect = 1 >>>>>>> set queue workq resources_min.walltime = 00:00:01 >>>>>>> set queue workq resources_default.cput = 10000:00:00 >>>>>>> set queue workq resources_default.nodect = 1 >>>>>>> set queue workq resources_default.walltime = 10000:00:00 >>>>>>> set queue workq enabled = True >>>>>>> set queue workq started = True >>>>>>> # >>>>>>> # Create and define queue parallel >>>>>>> # >>>>>>> create queue parallel >>>>>>> set queue parallel queue_type = Execution 
>>>>>>> set queue parallel resources_max.cput = 10000:00:00 >>>>>>> set queue parallel resources_max.ncpus = 200 >>>>>>> set queue parallel resources_max.nodect = 64 >>>>>>> set queue parallel resources_max.nodes = 200:ppn=4 >>>>>>> set queue parallel resources_max.walltime = 10000:00:00 >>>>>>> set queue parallel resources_min.ncpus = 1 >>>>>>> set queue parallel resources_min.nodect = 1 >>>>>>> set queue parallel resources_default.ncpus = 1 >>>>>>> set queue parallel resources_default.nodect = 1 >>>>>>> set queue parallel resources_default.nodes = 1:ppn=1 >>>>>>> set queue parallel resources_default.walltime = 10000:00:00 >>>>>>> set queue parallel enabled = True >>>>>>> set queue parallel started = True >>>>>>> # >>>>>>> # Set server attributes. >>>>>>> # >>>>>>> set server scheduling = True >>>>>>> set server acl_host_enable = False >>>>>>> set server acl_user_enable = False >>>>>>> set server default_queue = serial >>>>>>> set server log_events = 127 >>>>>>> set server mail_from = adm >>>>>>> set server query_other_jobs = True >>>>>>> set server resources_available.ncpus = 200 >>>>>>> set server resources_available.nodect = 64 >>>>>>> set server resources_available.nodes = 200 >>>>>>> set server resources_default.neednodes = 1 >>>>>>> set server resources_default.nodect = 1 >>>>>>> set server resources_default.nodes = 1 >>>>>>> set server resources_max.ncpus = 200 >>>>>>> set server resources_max.nodes = 200 >>>>>>> set server scheduler_iteration = 60 >>>>>>> set server node_check_rate = 150 >>>>>>> set server tcp_timeout = 6 >>>>>>> set server default_node = 1 >>>>>>> set server pbs_version = 2.0.0p8 >>>>>>> >>>>>>> >>>>>>> >>>>>>> Thanks >>>>>>> >>>>>>> Nilesh Mistry >>>>>>> Academic Computing Services >>>>>>> [EMAIL PROTECTED] & TEL Campus >>>>>>> Seneca College Of Applies Arts & Technology >>>>>>> 70 The Pond Road >>>>>>> Toronto, Ontario >>>>>>> M3J 3M6 Canada >>>>>>> Phone 416 491 5050 ext 3788 >>>>>>> Fax 416 661 4695 >>>>>>> http://acs.senecac.on.ca >>>>>>> >>>>>>> >>>>>>> >>>>>>> Michael Edwards wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> We'd need your script and the qsub command you used, possibly more >>>>>>>> configuration information from maui and torque, to be much help. >>>>>>>> >>>>>>>> I don't know that we have anyone who is deep with maui or torque right >>>>>>>> now, you might also want to ask on the maui or torque lists. >>>>>>>> >>>>>>>> >From the other posts you have made this error seems to be one of those >>>>>>>> general "Something is Broken" messages that could have many causes. >>>>>>>> >>>>>>>> On 9/17/07, Nilesh Mistry <[EMAIL PROTECTED]> wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> Hello >>>>>>>>> >>>>>>>>> I am having problems submitting job that requires 23 threads. I keep >>>>>>>>> getting the following error: >>>>>>>>> >>>>>>>>> ERROR: Number of meshes not equal to number of thread >>>>>>>>> >>>>>>>>> Hardware: >>>>>>>>> 10 quad core nodes (therefore 40 processors available) >>>>>>>>> >>>>>>>>> What do I need to insure in my job queue (qmgr) , maui (maui.cfg) and >>>>>>>>> my submit script when using qsub? >>>>>>>>> >>>>>>>>> Any and all help is greatly appreciated. 
>>>>>>>>> -- >>>>>>>>> Thanks >>>>>>>>> Nilesh Mistry >>>>>>>>> Academic Computing Services >>>>>>>>> [EMAIL PROTECTED] & TEL Campus >>>>>>>>> Seneca College Of Applies Arts & Technology >>>>>>>>> 70 The Pond Road >>>>>>>>> Toronto, Ontario >>>>>>>>> M3J 3M6 Canada >>>>>>>>> Phone 416 491 5050 ext 3788 >>>>>>>>> Fax 416 661 4695 >>>>>>>>> http://acs.senecac.on.ca
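P.S. Regarding Michael's earlier suggestion of two queues for the mixed nodes: a rough sketch in qmgr, built only from the numbers in this thread (50 quad-core and 14 dual-core Opteron nodes, 228 cores in total), might look like the lines below. The "quad" and "dual" properties are placeholders; they would have to be added to the matching entries in the server's nodes file before jobs could be routed on them, and I have not tried this myself yet.

# quad-core nodes: 50 nodes x 4 cores = 200 cores
create queue quadq
set queue quadq queue_type = Execution
set queue quadq resources_max.ncpus = 200
set queue quadq resources_max.nodect = 50
set queue quadq resources_max.nodes = 50:ppn=4:quad
set queue quadq enabled = True
set queue quadq started = True
#
# dual-core Opterons: 14 nodes x 2 cores = 28 cores
create queue dualq
set queue dualq queue_type = Execution
set queue dualq resources_max.ncpus = 28
set queue dualq resources_max.nodect = 14
set queue dualq resources_max.nodes = 14:ppn=2:dual
set queue dualq enabled = True
set queue dualq started = True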