Hello, I was checking the PBS server settings (qmgr -c "list server") and noticed the following values:
    resources_assigned.ncpus = 2
    resources_assigned.nodect = 24

Could these two values be my problem? I have been trying to change them as root but keep getting:

    qmgr obj=master svr=master: Cannot set attribute, read only or
    insufficient permission  resources_assigned.nodect
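For what it's worth, resources_assigned appears to be an attribute that pbs_server computes from the jobs currently running, which would explain the read-only error. A minimal sketch of the distinction, assuming that is the case (the value 64 below is only a placeholder):

    # Fails: resources_assigned.* is derived from running jobs, read-only.
    qmgr -c "set server resources_assigned.nodect = 64"

    # These are the writable limits an admin can actually set:
    qmgr -c "set server resources_available.nodect = 64"
    qmgr -c "set server resources_max.nodect = 64"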
Thanks

Nilesh Mistry
Academic Computing Services
[EMAIL PROTECTED] & TEL Campus
Seneca College of Applied Arts & Technology
70 The Pond Road
Toronto, Ontario
M3J 3M6 Canada
Phone 416 491 5050 ext 3788
Fax 416 661 4695
http://acs.senecac.on.ca


Nilesh Mistry wrote:
> Hello Michael
>
> I have tried this as well, with the same result.
>
> I have even set properties for the quad core nodes and specified them in
> the PBS script to select only those nodes, but still the same problem.
>
> I have also created a new queue, adding the properties of the quad core
> nodes to resources_max.nodes = 228:ppn=4:i965, but still the same result.
>
> Thanks
> Nilesh Mistry
>
> Michael Edwards wrote:
>
>> set queue workq resources_max.ncpus = 200
>> set queue workq resources_max.nodect = 64
>> set queue workq resources_max.nodes = 200:ppn=4
>>
>> This should probably be 50*4 + 14*2 = 228:
>>
>> set queue workq resources_max.ncpus = 228
>> set queue workq resources_max.nodect = 64
>> set queue workq resources_max.nodes = 228:ppn=4
>>
>> Though you might want to try making two queues; I don't know how well
>> Torque deals with having different numbers of ppn on different nodes.
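On the mixed-ppn point: Torque's node specification can join groups with "+", so a single request can name the two node types explicitly instead of relying on one ppn value for everything. A sketch of what that would look like here (untested on this cluster; the counts assume the 50 quad core / 14 dual core split described further down):

    # 50 nodes with 4 processors each + 14 nodes with 2 each = 228 slots
    #PBS -l nodes=50:ppn=4+14:ppn=2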
>> On 9/18/07, Nilesh Mistry <[EMAIL PROTECTED]> wrote:
>>
>>> I get the following error after using qsub:
>>>
>>> qsub: Job exceeds queue resource limits
>>>
>>> If I change #PBS -l nodes=64 to #PBS -l nodes=60, the job is submitted
>>> and runs, and then it fails.
>>>
>>> ################ qstat -f ############################
>>>
>>> Job Id: 924.master.atar.senecac.on.ca
>>>     Job_Name = scaling_test
>>>     Job_Owner = [EMAIL PROTECTED]
>>>     job_state = R
>>>     queue = parallel
>>>     server = master.atar.senecac.on.ca
>>>     Checkpoint = u
>>>     ctime = Tue Sep 18 09:09:45 2007
>>>     Error_Path = master:/home/faculty/nilesh.mistry/pbs/multi/scaling_test/scaling_test.err
>>>     exec_host = atarnode59.atar.senecac.on.ca/1+atarnode59.atar.senecac.on.ca/0
>>>         +atarnode57.atar.senecac.on.ca/1+atarnode57.atar.senecac.on.ca/0
>>>         [... continues to 60 slots in total on 23 nodes: two slots each
>>>         on atarnode51-57, 59 and 34; three slots each on atarnode35-42,
>>>         44-45 and 47-50 ...]
>>>     Hold_Types = n
>>>     Join_Path = oe
>>>     Keep_Files = n
>>>     Mail_Points = abe
>>>     Mail_Users = nilesh.mistry
>>>     mtime = Tue Sep 18 09:09:46 2007
>>>     Output_Path = master:/home/faculty/nilesh.mistry/pbs/multi/scaling_test/scaling_test.log
>>>     Priority = 0
>>>     qtime = Tue Sep 18 09:09:45 2007
>>>     Rerunable = True
>>>     Resource_List.cput = 10000:00:00
>>>     Resource_List.mem = 64000mb
>>>     Resource_List.ncpus = 1
>>>     Resource_List.nodect = 60
>>>     Resource_List.nodes = 60
>>>     Resource_List.walltime = 10000:00:00
>>>     Variable_List = PBS_O_HOME=/home/faculty/nilesh.mistry,
>>>         PBS_O_LANG=en_CA.UTF-8,PBS_O_LOGNAME=nilesh.mistry,
>>>         PBS_O_PATH=/usr/kerberos/bin:/opt/lam-7.1.2/bin:/usr/local/bin:
>>>         /bin:/usr/bin:/usr/X11R6/bin:/opt/env-switcher/bin:/opt/pvm3/lib:
>>>         /opt/pvm3/lib/LINUX:/opt/pvm3/bin/LINUX:/opt/pbs/bin:
>>>         /opt/pbs/lib/xpbs/bin:/opt/c3-4/:/home/faculty/nilesh.mistry/bin:
>>>         /opt/maui/bin:/usr/lib/news/bin:/home/faculty/nilesh.mistry/scripts,
>>>         PBS_O_MAIL=/var/spool/mail/nilesh.mistry,PBS_O_SHELL=/bin/bash,
>>>         PBS_O_HOST=master.atar.senecac.on.ca,
>>>         PBS_O_WORKDIR=/home/faculty/nilesh.mistry/pbs/multi/scaling_test,
>>>         PBS_O_QUEUE=parallel
>>>     etime = Tue Sep 18 09:09:45 2007
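(Side note on the numbers above: Resource_List.nodect = 60 and exec_host both show 60 execution slots for this run. The submit script further down derives its mpirun process count directly from the node file:

    # one line per allocated slot; for this job it prints 60
    NCPU=`wc -l < $PBS_NODEFILE`

so whatever mesh count the FDS input file expects has to match that number, which is presumably where the mesh/thread errors further down come from.)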
>>> ###################### Log file ##############################
>>>
>>> ------------------------------------------------------
>>> This job is allocated on 60 cpu(s)
>>> Job is running on node(s):
>>> atarnode59.atar.senecac.on.ca
>>> atarnode59.atar.senecac.on.ca
>>> atarnode57.atar.senecac.on.ca
>>> [... 60 lines in total, each allocated hostname repeated once per slot,
>>> matching the exec_host list above ...]
>>> PBS: qsub is running on master.atar.senecac.on.ca
>>> PBS: originating queue is parallel
>>> PBS: executing queue is parallel
>>> PBS: working directory is /home/faculty/nilesh.mistry/pbs/multi/scaling_test
>>> PBS: execution mode is PBS_BATCH
>>> PBS: job identifier is 924.master.atar.senecac.on.ca
>>> PBS: job name is scaling_test
>>> PBS: node file is /var/spool/pbs/aux//924.master.atar.senecac.on.ca
>>> PBS: current home directory is /home/faculty/nilesh.mistry
>>> PBS: PATH = /usr/kerberos/bin:/opt/lam-7.1.2/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/opt/env-switcher/bin:/opt/pvm3/lib:/opt/pvm3/lib/LINUX:/opt/pvm3/bin/LINUX:/opt/pbs/bin:/opt/pbs/lib/xpbs/bin:/opt/c3-4/:/home/faculty/nilesh.mistry/bin:/opt/maui/bin:/usr/lib/news/bin:/home/faculty/nilesh.mistry/scripts
>>> ------------------------------------------------------
>>> Mesh 1 of 60 is alive on atarnode59.atar.senecac.on.ca
>>> Mesh 17 of 60 is alive on atarnode50.atar.senecac.on.ca
>>> Mesh 18 of 60 is alive on atarnode50.atar.senecac.on.ca
>>> [... all 60 meshes report alive, in arrival order, across the
>>> allocated nodes ...]
>>> ERROR: Number of meshes not equal to number of threads
>>> [... the same ERROR line repeated many times, apparently once per
>>> MPI process ...]
>>>
>>> LAM 7.1.2/MPI 2 C++/ROMIO - Indiana University
>>>
>>> Thanks
>>> Nilesh Mistry
>>>
>>> Michael Edwards wrote:
>>>
>>>> What do you get when you do "qstat -f" on the job? How many nodes is
>>>> it actually getting?
>>>>
>>>> On 9/18/07, Nilesh Mistry <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Michael,
>>>>>
>>>>> We have actually moved to a larger cluster of 64 nodes (50 quad core
>>>>> and 14 dual opterons), therefore 228 processors available. We are
>>>>> submitting a job that requires 64 threads, but still with the same
>>>>> result. Here are the files you requested. I have already posted to
>>>>> the torque users list.
>>>>>
>>>>> ####### PBS SCRIPT START #######
>>>>>
>>>>> #!/bin/sh -f
>>>>> #PBS -l nodes=64
>>>>> #PBS -N scaling_test
>>>>> #PBS -e scaling_test.err
>>>>> #PBS -o scaling_test.log
>>>>> #PBS -j oe
>>>>> #PBS -l mem=64000mb
>>>>> #PBS -m abe
>>>>> #PBS -q parallel
>>>>>
>>>>> NCPU=`wc -l < $PBS_NODEFILE`
>>>>> echo ------------------------------------------------------
>>>>> echo ' This job is allocated on '${NCPU}' cpu(s)'
>>>>> echo 'Job is running on node(s): '
>>>>> cat $PBS_NODEFILE
>>>>> echo PBS: qsub is running on $PBS_O_HOST
>>>>> echo PBS: originating queue is $PBS_O_QUEUE
>>>>> echo PBS: executing queue is $PBS_QUEUE
>>>>> echo PBS: working directory is $PBS_O_WORKDIR
>>>>> echo PBS: execution mode is $PBS_ENVIRONMENT
>>>>> echo PBS: job identifier is $PBS_JOBID
>>>>> echo PBS: job name is $PBS_JOBNAME
>>>>> echo PBS: node file is $PBS_NODEFILE
>>>>> echo PBS: current home directory is $PBS_O_HOME
>>>>> echo PBS: PATH = $PBS_O_PATH
>>>>> echo ------------------------------------------------------
>>>>> SERVER=$PBS_O_HOST
>>>>> WORKDIR=$HOME/pbs/multi/scaling_test
>>>>> cd ${WORKDIR}
>>>>> cat $PBS_NODEFILE > nodes.list
>>>>> lamboot -s -H $PBS_NODEFILE
>>>>> mpirun -np $NCPU /opt/fds/fds5_mpi scaling_test.fds
>>>>> lamhalt
>>>>>
>>>>> ####### PBS SCRIPT END #######
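A small guard in that script would surface the slot/process mismatch before FDS does. This is a sketch only, not part of the original script; the hard-coded 64 matches the nodes=64 request above:

    # Abort early if PBS handed us fewer slots than the FDS case expects,
    # instead of letting fds5_mpi fail on "meshes not equal to threads".
    NCPU=`wc -l < $PBS_NODEFILE`
    if [ "$NCPU" -ne 64 ]; then
        echo "Expected 64 slots, got $NCPU -- aborting" >&2
        exit 1
    fi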
>>>>> ####### MAUI.CFG START #######
>>>>>
>>>>> # maui.cfg 3.2.6p14
>>>>>
>>>>> SERVERHOST            master.atar.senecac.on.ca
>>>>> # primary admin must be first in list
>>>>> ADMIN1                root
>>>>> ADMIN3                nilesh.mistry
>>>>>
>>>>> # Resource Manager Definition
>>>>> RMCFG[master.atar.senecac.on.ca] TYPE=PBS
>>>>>
>>>>> # Allocation Manager Definition
>>>>> AMCFG[bank] TYPE=NONE
>>>>>
>>>>> # full parameter docs at
>>>>> # http://clusterresources.com/mauidocs/a.fparameters.html
>>>>> # use the 'schedctl -l' command to display current configuration
>>>>>
>>>>> RMPOLLINTERVAL        00:01:00
>>>>> SERVERPORT            42559
>>>>> SERVERMODE            NORMAL
>>>>>
>>>>> # Admin: http://clusterresources.com/mauidocs/a.esecurity.html
>>>>>
>>>>> LOGFILE               maui.log
>>>>> LOGFILEMAXSIZE        10000000
>>>>> LOGLEVEL              4
>>>>> LOGFACILITY           fALL
>>>>>
>>>>> # Job Priority: http://clusterresources.com/mauidocs/5.1jobprioritization.html
>>>>> QUEUETIMEWEIGHT       1
>>>>>
>>>>> # FairShare: http://clusterresources.com/mauidocs/6.3fairshare.html
>>>>> #FSPOLICY              PSDEDICATED
>>>>> #FSDEPTH               7
>>>>> #FSINTERVAL            86400
>>>>> #FSDECAY               0.80
>>>>>
>>>>> # Throttling Policies: http://clusterresources.com/mauidocs/6.2throttlingpolicies.html
>>>>> # NONE SPECIFIED
>>>>>
>>>>> # Backfill: http://clusterresources.com/mauidocs/8.2backfill.html
>>>>> BACKFILLPOLICY        ON
>>>>> RESERVATIONPOLICY     CURRENTHIGHEST
>>>>>
>>>>> # the following are modified/added by Mehrdad 13 Sept 07
>>>>> #NODEACCESSPOLICY      DEDICATED
>>>>> NODEACCESSPOLICY      SHARED
>>>>> JOBNODEMATCHPOLICY    EXACTPROC
>>>>>
>>>>> # Node Allocation: http://clusterresources.com/mauidocs/5.2nodeallocation.html
>>>>> NODEALLOCATIONPOLICY  MINRESOURCE
>>>>>
>>>>> # QOS: http://clusterresources.com/mauidocs/7.3qos.html
>>>>> # QOSCFG[hi]  PRIORITY=100 XFTARGET=100 FLAGS=PREEMPTOR:IGNMAXJOB
>>>>> # QOSCFG[low] PRIORITY=-1000 FLAGS=PREEMPTEE
>>>>>
>>>>> # Standing Reservations:
>>>>> # http://clusterresources.com/mauidocs/7.1.3standingreservations.html
>>>>> # SRSTARTTIME[test] 8:00:00
>>>>> # SRENDTIME[test]   17:00:00
>>>>> # SRDAYS[test]      MON TUE WED THU FRI
>>>>> # SRTASKCOUNT[test] 20
>>>>> # SRMAXTIME[test]   0:30:00
>>>>>
>>>>> # Creds: http://clusterresources.com/mauidocs/6.1fairnessoverview.html
>>>>> # USERCFG[DEFAULT]      FSTARGET=25.0
>>>>> # USERCFG[john]         PRIORITY=100 FSTARGET=10.0-
>>>>> # GROUPCFG[staff]       PRIORITY=1000 QLIST=hi:low QDEF=hi
>>>>> # CLASSCFG[batch]       FLAGS=PREEMPTEE
>>>>> # CLASSCFG[interactive] FLAGS=PREEMPTOR
>>>>> USERCFG[DEFAULT]      MAXJOB=4
>>>>>
>>>>> ####### MAUI.CFG END #######
>>>>>
>>>>> ####### QMGR -c "PRINT SERVER MASTER" #######
>>>>>
>>>>> #
>>>>> # Create queues and set their attributes.
>>>>> #
>>>>> # Create and define queue serial
>>>>> #
>>>>> create queue serial
>>>>> set queue serial queue_type = Execution
>>>>> set queue serial resources_max.cput = 1000:00:00
>>>>> set queue serial resources_max.mem = 3000mb
>>>>> set queue serial resources_max.ncpus = 1
>>>>> set queue serial resources_max.nodect = 1
>>>>> set queue serial resources_max.nodes = 1:ppn=1
>>>>> set queue serial resources_max.walltime = 1000:00:00
>>>>> set queue serial resources_default.cput = 336:00:00
>>>>> set queue serial resources_default.mem = 900mb
>>>>> set queue serial resources_default.ncpus = 1
>>>>> set queue serial resources_default.nodect = 1
>>>>> set queue serial resources_default.nodes = 1:ppn=1
>>>>> set queue serial enabled = True
>>>>> set queue serial started = True
>>>>> #
>>>>> # Create and define queue workq
>>>>> #
>>>>> create queue workq
>>>>> set queue workq queue_type = Execution
>>>>> set queue workq resources_max.cput = 10000:00:00
>>>>> set queue workq resources_max.ncpus = 200
>>>>> set queue workq resources_max.nodect = 64
>>>>> set queue workq resources_max.nodes = 200:ppn=4
>>>>> set queue workq resources_max.walltime = 10000:00:00
>>>>> set queue workq resources_min.cput = 00:00:01
>>>>> set queue workq resources_min.ncpus = 1
>>>>> set queue workq resources_min.nodect = 1
>>>>> set queue workq resources_min.walltime = 00:00:01
>>>>> set queue workq resources_default.cput = 10000:00:00
>>>>> set queue workq resources_default.nodect = 1
>>>>> set queue workq resources_default.walltime = 10000:00:00
>>>>> set queue workq enabled = True
>>>>> set queue workq started = True
>>>>> #
>>>>> # Create and define queue parallel
>>>>> #
>>>>> create queue parallel
>>>>> set queue parallel queue_type = Execution
>>>>> set queue parallel resources_max.cput = 10000:00:00
>>>>> set queue parallel resources_max.ncpus = 200
>>>>> set queue parallel resources_max.nodect = 64
>>>>> set queue parallel resources_max.nodes = 200:ppn=4
>>>>> set queue parallel resources_max.walltime = 10000:00:00
>>>>> set queue parallel resources_min.ncpus = 1
>>>>> set queue parallel resources_min.nodect = 1
>>>>> set queue parallel resources_default.ncpus = 1
>>>>> set queue parallel resources_default.nodect = 1
>>>>> set queue parallel resources_default.nodes = 1:ppn=1
>>>>> set queue parallel resources_default.walltime = 10000:00:00
>>>>> set queue parallel enabled = True
>>>>> set queue parallel started = True
>>>>> #
>>>>> # Set server attributes.
>>>>> #
>>>>> set server scheduling = True
>>>>> set server acl_host_enable = False
>>>>> set server acl_user_enable = False
>>>>> set server default_queue = serial
>>>>> set server log_events = 127
>>>>> set server mail_from = adm
>>>>> set server query_other_jobs = True
>>>>> set server resources_available.ncpus = 200
>>>>> set server resources_available.nodect = 64
>>>>> set server resources_available.nodes = 200
>>>>> set server resources_default.neednodes = 1
>>>>> set server resources_default.nodect = 1
>>>>> set server resources_default.nodes = 1
>>>>> set server resources_max.ncpus = 200
>>>>> set server resources_max.nodes = 200
>>>>> set server scheduler_iteration = 60
>>>>> set server node_check_rate = 150
>>>>> set server tcp_timeout = 6
>>>>> set server default_node = 1
>>>>> set server pbs_version = 2.0.0p8
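Since the failing job is submitted to the parallel queue, Michael's 228 correction would need to be applied there as well as to workq. A sketch of the corresponding qmgr commands, using the same arithmetic (50*4 + 14*2 = 228); whether the server-level limit also needs raising is an assumption:

    qmgr -c "set queue parallel resources_max.ncpus = 228"
    qmgr -c "set queue parallel resources_max.nodes = 228:ppn=4"
    qmgr -c "set server resources_max.ncpus = 228"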
>>>>>
>>>>> Thanks
>>>>> Nilesh Mistry
>>>>>
>>>>> Michael Edwards wrote:
>>>>>
>>>>>> We'd need your script and the qsub command you used, and possibly
>>>>>> more configuration information from maui and torque, to be much help.
>>>>>>
>>>>>> I don't know that we have anyone with deep maui or torque experience
>>>>>> right now; you might also want to ask on the maui or torque lists.
>>>>>>
>>>>>> From the other posts you have made, this error seems to be one of
>>>>>> those general "Something is Broken" messages that could have many
>>>>>> causes.
>>>>>>
>>>>>> On 9/17/07, Nilesh Mistry <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>>> Hello
>>>>>>>
>>>>>>> I am having problems submitting a job that requires 23 threads. I
>>>>>>> keep getting the following error:
>>>>>>>
>>>>>>> ERROR: Number of meshes not equal to number of threads
>>>>>>>
>>>>>>> Hardware:
>>>>>>> 10 quad core nodes (therefore 40 processors available)
>>>>>>>
>>>>>>> What do I need to ensure in my job queue (qmgr), maui (maui.cfg) and
>>>>>>> my submit script when using qsub?
>>>>>>>
>>>>>>> Any and all help is greatly appreciated.
>>>>>>>
>>>>>>> --
>>>>>>> Thanks
>>>>>>> Nilesh Mistry
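For the record, on the 10-node quad core layout in that original question, a 23-process run can be requested exactly with a mixed node specification. A sketch only, assuming the FDS input really contains 23 meshes (case.fds is a placeholder name):

    # 5 nodes x 4 slots + 1 node x 3 slots = 23 MPI processes
    #PBS -l nodes=5:ppn=4+1:ppn=3

    NP=`wc -l < $PBS_NODEFILE`     # 23 with the request above
    mpirun -np $NP /opt/fds/fds5_mpi case.fds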
-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
Oscar-users mailing list
Oscar-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/oscar-users