Marvin,
We use Maui 3.3 and Torque 2.5.4, and our Maui config looks like yours (except we have NODEALLOCATIONPOLICY set to PRIORITY).

Your first qsub asks for 12 processors on one node plus 1 processor on one node: 2 nodes and 13 processors in total. Your second asks for 12 processors on each of 3 nodes (36 total) plus one processor on one node: 4 nodes and 37 processors. How many nodes and processors do you have according to showq?

You also noted that

$ qsub -l nodes=4:ppn=12+1:ppn=1

worked, which I find strange, as it requires 49 processors and 5 nodes. Any other processor or node restrictions in your Torque or Maui config?

Peter

From: Marvin Novaglobal [mailto:[email protected]]
Sent: Thursday, March 24, 2011 10:56 PM
To: Peter Michael Crosta
Cc: [email protected]
Subject: Re: [Mauiusers] Multiple job request peculiarities

Sorry, I just had a look at my original post again. The description was missing a '+' sign, but in my actual testing I did use the '+' sign. Therefore:

qsub -l nodes=1:ppn=12+1:ppn=1   (works)
qsub -l nodes=3:ppn=12+1:ppn=1   (does not work; job goes to idle)

Weird stuff. May I know if you guys have encountered this?

Regards,
Marvin

On Fri, Mar 25, 2011 at 10:46 AM, Marvin Novaglobal <[email protected]> wrote:

Hi Peter,

It doesn't work on my setup. I meant it only applies to nodes=3 and nodes=5 so far; we don't have enough resources to test nodes=7. So again,

qsub -l nodes=1:ppn=12+1:ppn=1   (works)
qsub -l nodes=3:ppn=12+1:ppn=1   (does not work)

May I know which versions of Maui and Torque you are using? Your Maui and Torque configs as well, please.

Regards,
Marvin

On Fri, Mar 25, 2011 at 12:20 AM, Peter Michael Crosta <[email protected]> wrote:

Hi Marvin,

I have gotten multiple resource requests to work by using the "+" sign. Have you tried

qsub -l nodes=3:ppn=12+1:ppn=1 ?
Best,
Peter

On Thu, 24 Mar 2011, Marvin Novaglobal wrote:

Hi,

On my setup,

$ qsub -l nodes=1:ppn=12:1:ppn=1   (works)
$ qsub -l nodes=2:ppn=12:1:ppn=1   (works)
$ qsub -l nodes=3:ppn=12:1:ppn=1   (job goes to idle and never gets executed)
$ qsub -l nodes=4:ppn=12:1:ppn=1   (works)
$ qsub -l nodes=5:ppn=12:1:ppn=1   (job goes to idle and never gets executed)

<Maui.cfg>
...
ENABLEMULTINODEJOBS[0]   TRUE
ENABLEMULTIREQJOBS[0]    TRUE
JOBNODEMATCHPOLICY[0]    EXACTNODE
NODEALLOCATIONPOLICY[0]  MINRESOURCE

<Torque.cfg>
set server scheduling = True
set server acl_hosts = aquarius.local
set server managers = torque@aquarius
set server operators = torque@aquarius
set server default_queue = DEFAULT
set server log_events = 511
set server mail_from = adm
set server resources_available.nodect = 2048
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
set server mom_job_sync = True
set server keep_completed = 300
set server next_job_number = 377

<maui.log>
03/24 20:23:48 MResDestroy(377)
03/24 20:23:48 MResChargeAllocation(377,2)
03/24 20:23:48 MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
03/24 20:23:48 INFO: total jobs selected in partition ALL: 1/1
03/24 20:23:48 MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,DEFAULT,FReason,TRUE)
03/24 20:23:48 INFO: total jobs selected in partition DEFAULT: 1/1
03/24 20:23:48 MQueueScheduleIJobs(Q,DEFAULT)
03/24 20:23:48 INFO: 72 feasible tasks found for job 377:0 in partition DEFAULT (36 Needed)
03/24 20:23:48 INFO: 72 feasible tasks found for job 377:1 in partition DEFAULT (1 Needed)
03/24 20:23:48 ALERT: inadequate tasks to allocate to job 377:1 (0 < 1)
03/24 20:23:48 ERROR: cannot allocate nodes to job '377' in partition DEFAULT
03/24 20:23:48 MJobPReserve(377,DEFAULT,ResCount,ResCountRej)
03/24 20:23:48 MJobReserve(377,Priority)
03/24 20:23:48 INFO: 72 feasible tasks found for job 377:0 in partition DEFAULT (36 Needed)
03/24 20:23:48 INFO: 72 feasible tasks found for job 377:1 in partition DEFAULT (1 Needed)
03/24 20:23:48 INFO: 72 feasible tasks found for job 377:0 in partition DEFAULT (36 Needed)
03/24 20:23:48 INFO: 72 feasible tasks found for job 377:1 in partition DEFAULT (1 Needed)
03/24 20:23:48 INFO: located resources for 36 tasks (144) in best partition DEFAULT for job 377 at time 00:00:01
03/24 20:23:48 INFO: tasks located for job 377: 37 of 36 required (144 feasible)
03/24 20:23:48 MResJCreate(377,MNodeList,00:00:01,Priority,Res)
03/24 20:23:48 INFO: job '377' reserved 36 tasks (partition DEFAULT) to start in 00:00:01 on Thu Mar 24 20:23:49 (WC: 2592000)

<pbs_server.log>
03/24/2011 20:23:17;0100;PBS_Server;Job;377.aquarius;enqueuing into DEFAULT, state 1 hop 1
03/24/2011 20:23:17;0008;PBS_Server;Job;377.aquarius;Job Queued at request of torque@aquarius, owner = torque@aquarius, job name = parallel.sh, queue = DEFAULT
03/24/2011 20:23:17;0040;PBS_Server;Svr;aquarius;Scheduler was sent the command new

Has anyone else encountered problems with multiple job requests?

Regards,
Marvin
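[Editor's note, not part of the original thread: the node and processor totals Peter works out above can be checked mechanically. A minimal shell sketch, assuming a multi-request spec of the form "N:ppn=M+N:ppn=M" as used in the thread; the parsing helper is illustrative only and handles nothing beyond that shape.]

```shell
# Sketch: count the nodes and processors requested by a spec such as
# "3:ppn=12+1:ppn=1" (the failing example from the thread).
spec="3:ppn=12+1:ppn=1"
total_nodes=0
total_procs=0
# Each '+'-separated part is one resource request.
for part in $(printf '%s\n' "$spec" | tr '+' ' '); do
    n=${part%%:*}          # node count: everything before the first ':'
    ppn=${part##*ppn=}     # processors per node: everything after 'ppn='
    total_nodes=$((total_nodes + n))
    total_procs=$((total_procs + n * ppn))
done
echo "$total_nodes nodes, $total_procs processors"   # -> 4 nodes, 37 processors
```

The same arithmetic on nodes=4:ppn=12+1:ppn=1 gives 5 nodes and 49 processors, matching the figures in Peter's reply.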
_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers
