Fernando,

This may simply be by design. When a job is queued, any idle resources it needs are reserved for it. So if it needs 12 cores on a 16-core machine where an 8-core job is already running, it will reserve the remaining 8 cores while it waits for the other 4 to be freed.
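To make that arithmetic concrete, here is a toy model in Python. It is only a sketch of the idea, not how Maui is implemented: the `Job` class, `reserve` and `can_backfill` are hypothetical names, a single node is assumed, and "hours left" stands in for the walltime the scheduler believes a job still needs. It also includes the check a scheduler would make before letting a smaller job borrow the reserved cores.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cores: int
    hours_left: float  # assumed: remaining (or requested) walltime in hours


def reserve(node_cores, running, waiting):
    """Toy reservation for one waiting job on one node.

    Returns (cores reserved now, cores still missing, hours until it can start).
    """
    if waiting.cores > node_cores:
        raise ValueError("job can never fit on this node")
    free = node_cores - sum(job.cores for job in running)
    reserved = min(free, waiting.cores)
    missing = waiting.cores - reserved

    # Walk the running jobs in order of completion until enough cores free up.
    earliest_start = 0.0
    still_needed = missing
    for job in sorted(running, key=lambda j: j.hours_left):
        if still_needed <= 0:
            break
        still_needed -= job.cores
        earliest_start = job.hours_left
    return reserved, missing, earliest_start


def can_backfill(candidate, reserved, earliest_start):
    """True if the candidate fits on the reserved cores and finishes in time.

    Simplification: every idle core is assumed to be part of the reservation.
    """
    return candidate.cores <= reserved and candidate.hours_left <= earliest_start


if __name__ == "__main__":
    running = [Job("running-8", cores=8, hours_left=10.0)]
    big = Job("big-12", cores=12, hours_left=24.0)
    reserved, missing, start = reserve(16, running, big)
    print(reserved, missing, start)
    print(can_backfill(Job("short-8", 8, 2.0), reserved, start))
    print(can_backfill(Job("long-8", 8, 48.0), reserved, start))
```

In this model a job only slips in if its walltime is shorter than the wait for the reserved job, so a job that carries a very long walltime (for example one inherited from a queue default) would never pass the check.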
Now, if another 8-core job is submitted, it has to wait, unless it can run on those 8 cores and finish before the running 8-core job completes. In that case Maui can 'squeeze' it in without affecting the start time of the 12-core job.

So it could be that you are seeing resources reserved for the 12-core job because the smaller jobs could not be run without pushing back the earliest start time of the 12-core job.

Brian Andrus
ITACS/Research Computing
Naval Postgraduate School
Monterey, California
voice: 831-656-6238

From: [email protected] [mailto:[email protected]] On Behalf Of Fernando Caba
Sent: Tuesday, June 23, 2015 4:04 PM
To: Torque Users Mailing List; mauiusers
Subject: [Mauiusers] Queueing jobs in inappropriate order

Hi All,

In my cluster the users run jobs on a single node with different numbers of processors (nodes=1:ppn=4, 8 or 12).

For some reason, jobs are queued even though resources are available. For example, a job requiring 12 cores is queued and several nodes have 8 cores free (we have 8 nodes and each node has 12 cores). If the users then submit new jobs with 4 or 8 cores, those jobs don't run; they are queued in spite of the available resources.

Here is my maui.cfg:

# maui.cfg 3.3.1

SERVERHOST            fe
# primary admin must be first in list
ADMIN1                root

# Resource Manager Definition

RMCFG[FE] TYPE=PBS

# Allocation Manager Definition

AMCFG[bank]  TYPE=NONE

# full parameter docs at http://supercluster.org/mauidocs/a.fparameters.html
# use the 'schedctl -l' command to display current configuration

RMPOLLINTERVAL        00:00:30

SERVERPORT            42559
SERVERMODE            NORMAL

# Admin: http://supercluster.org/mauidocs/a.esecurity.html

LOGFILE               maui.log
LOGFILEMAXSIZE        10000000
LOGLEVEL              3

# Job Priority: http://supercluster.org/mauidocs/5.1jobprioritization.html

QUEUETIMEWEIGHT       1

# FairShare: http://supercluster.org/mauidocs/6.3fairshare.html

#FSPOLICY              PSDEDICATED
#FSDEPTH               7
#FSINTERVAL            86400
#FSDECAY               0.80

# Throttling Policies: http://supercluster.org/mauidocs/6.2throttlingpolicies.html

# NONE SPECIFIED

# Backfill: http://supercluster.org/mauidocs/8.2backfill.html

BACKFILLPOLICY        FIRSTFIT
RESERVATIONPOLICY     CURRENTHIGHEST

# Node Allocation: http://supercluster.org/mauidocs/5.2nodeallocation.html

NODEALLOCATIONPOLICY  MINRESOURCE
#NODEALLOCATIONPOLICY  FIRSTAVAILABLE

# QOS: http://supercluster.org/mauidocs/7.3qos.html

# QOSCFG[hi]  PRIORITY=100 XFTARGET=100 FLAGS=PREEMPTOR:IGNMAXJOB
# QOSCFG[low] PRIORITY=-1000 FLAGS=PREEMPTEE

# Standing Reservations: http://supercluster.org/mauidocs/7.1.3standingreservations.html

# SRSTARTTIME[test] 8:00:00
# SRENDTIME[test]   17:00:00
# SRDAYS[test]      MON TUE WED THU FRI
# SRTASKCOUNT[test] 20
# SRMAXTIME[test]   0:30:00

# Creds: http://supercluster.org/mauidocs/6.1fairnessoverview.html

# USERCFG[DEFAULT]      FSTARGET=25.0
# USERCFG[john]         PRIORITY=100  FSTARGET=10.0-
# GROUPCFG[staff]       PRIORITY=1000 QLIST=hi:low QDEF=hi
# CLASSCFG[batch]       FLAGS=PREEMPTEE
# CLASSCFG[interactive] FLAGS=PREEMPTOR

CLASSCFG[batch]       MAXPROCPERUSER=12

JOBNODEMATCHPOLICY    EXACTPROC
#JOBNODEMATCHPOLICY    EXACTNODE

and here is my torque configuration:

#
# Create queues and set their attributes.
#
#
# Create and define queue batch
#
create queue batch
set queue batch queue_type = Execution
set queue batch resources_default.nodes = 8
set queue batch resources_default.walltime = 4800:00:00
set queue batch enabled = True
set queue batch started = True
#
# Set server attributes.
#
set server scheduling = True
set server acl_hosts = fe
set server managers = root@fe
set server operators = root@fe
set server default_queue = batch
set server log_events = 511
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
set server log_level = 7
set server mom_job_sync = True
set server keep_completed = 300
set server auto_node_np = True
set server next_job_number = 10422
set server record_job_info = True
set server record_job_script = True

So, I was thinking about creating different queues: one for 4-core jobs, another for 8-core jobs and another for 12-core jobs.

Is this a reasonable policy, forcing the exact number of cores for each job in its corresponding queue (4, 8 or 12 cores per job)?

Thanks in advance!!

Fernando

--
Universidad Nacional del Sur <http://www.uns.edu.ar>
Mg. Fernando Caba
Director General de Telecomunicaciones
Avda. Alem 1253, (B8000CPB) Bahía Blanca - Argentina
Tel/Fax: (54)-291-4595166
Tel: (54)-291-4595101 int. 2050
http://www.dgt.uns.edu.ar
