The users always launch their jobs with nodes=1:ppn=4.
Thanks Fernando El 08/10/2013 07:03 PM, Gus Correa escribió:
Hi Fernando This is unlikely, but is there any chance that, due to fragmentation over time, the 4 idle processors exist in separate nodes? You can check that with qtop (or pbsnodes). Do jobs 5610 and 5613 request processors in a single node? Say: #PBS -l nodes=1:ppn=4 . If so, I wonder if Maui+Torque would start the job at all, in case the 4 idle processors straddle more than one node. In this case, maybe the user could try to launch the jobs with #PBS -l nodes=4:ppn=1+ppn=1+ppn=1+ppn=1, instead? Just a guess, Gus Correa On 10/08/2013 10:29 AM, Fernando Caba wrote:Hi All, i have a cluster with maui 3.3.1 and torque 3.0.1. Cluster works fine. But, the next situation recently is very often. Having free processors (4 in this case), whyjobs 5610 and 5613 (4 processors each), don´t ingress in the the queue. It´s obviously that i have freee resources (4 processors), the users claims for execution deferred (or idle) [root@cluster]# showq ACTIVE JOBS-------------------- JOBNAME USERNAME STATE PROC REMAINING STARTTIME 5410 ricardo Running 4 INFINITY Mon Sep 23 11:35:22 5463 sol Running 4 INFINITY Fri Sep 27 19:11:06 5464 sol Running 4 INFINITY Fri Sep 27 19:11:24 5535 carolina Running 4 INFINITY Tue Oct 1 11:07:33 5536 nicolas Running 4 INFINITY Tue Oct 1 12:46:10 5559 mmarta Running 4 INFINITY Fri Oct 4 10:06:34 5560 mmarta Running 4 INFINITY Fri Oct 4 10:09:23 5568 leandro Running 4 INFINITY Fri Oct 4 15:57:36 5579 silvia Running 12 INFINITY Sat Oct 5 22:40:39 5586 carolina Running 4 INFINITY Sun Oct 6 18:36:15 5588 walter Running 4 INFINITY Sun Oct 6 23:35:03 5592 patricia Running 4 INFINITY Mon Oct 7 09:03:27 5594 mmarta Running 4 INFINITY Mon Oct 7 11:00:57 5600 ricardo Running 4 INFINITY Mon Oct 7 17:50:09 5601 carolina Running 4 INFINITY Mon Oct 7 18:28:48 5602 nicolas Running 4 INFINITY Mon Oct 7 20:43:33 5605 nicolas Running 4 INFINITY Tue Oct 8 00:14:51 5606 gabriela Running 4 INFINITY Tue Oct 8 06:17:47 5607 gabriela Running 4 INFINITY Tue Oct 8 06:21:50 5608 gabriela Running 4 INFINITY Tue Oct 8 06:25:26 5609 cecilia Running 4 INFINITY Tue Oct 8 06:29:13 21 Active Jobs 92 of 96 Processors Active (95.83%) 8 of 8 Nodes Active (100.00%) IDLE JOBS---------------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 0 Idle Jobs BLOCKED JOBS---------------- JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME 5584 patricia Deferred 12 INFINITY Sun Oct 6 11:02:55 5610 cecilia Deferred 4 INFINITY Tue Oct 8 06:30:26 5613 patricia Deferred 4 INFINITY Tue Oct 8 11:06:19 Total Jobs: 24 Active Jobs: 21 Idle Jobs: 0 Blocked Jobs: 3 [root@cluster]# this is my maui.cfg file: # maui.cfg 3.3.1 SERVERHOST fe #SERVERHOST castellani13.fisica.uns.edu.ar # primary admin must be first in list ADMIN1 root # Resource Manager Definition RMCFG[FE] TYPE=PBS # Allocation Manager Definition AMCFG[bank] TYPE=NONE # full parameter docs at http://supercluster.org/mauidocs/a.fparameters.html # use the 'schedctl -l' command to display current configuration RMPOLLINTERVAL 00:00:30 SERVERPORT 42559 SERVERMODE NORMAL # Admin: http://supercluster.org/mauidocs/a.esecurity.html LOGFILE maui.log LOGFILEMAXSIZE 10000000 LOGLEVEL 3 # Job Priority: http://supercluster.org/mauidocs/5.1jobprioritization.html QUEUETIMEWEIGHT 1 # FairShare: http://supercluster.org/mauidocs/6.3fairshare.html #FSPOLICY PSDEDICATED #FSDEPTH 7 #FSINTERVAL 86400 #FSDECAY 0.80 # Throttling Policies: http://supercluster.org/mauidocs/6.2throttlingpolicies.html # NONE SPECIFIED # Backfill: http://supercluster.org/mauidocs/8.2backfill.html BACKFILLPOLICY FIRSTFIT RESERVATIONPOLICY CURRENTHIGHEST # Node Allocation: http://supercluster.org/mauidocs/5.2nodeallocation.html NODEALLOCATIONPOLICY MINRESOURCE # QOS: http://supercluster.org/mauidocs/7.3qos.html # QOSCFG[hi] PRIORITY=100 XFTARGET=100 FLAGS=PREEMPTOR:IGNMAXJOB # QOSCFG[low] PRIORITY=-1000 FLAGS=PREEMPTEE # Standing Reservations: http://supercluster.org/mauidocs/7.1.3standingreservations.html # SRSTARTTIME[test] 8:00:00 # SRENDTIME[test] 17:00:00 # SRDAYS[test] MON TUE WED THU FRI # SRTASKCOUNT[test] 20 # SRMAXTIME[test] 0:30:00 # Creds: http://supercluster.org/mauidocs/6.1fairnessoverview.html # USERCFG[DEFAULT] FSTARGET=25.0 # USERCFG[john] PRIORITY=100 FSTARGET=10.0- # GROUPCFG[staff] PRIORITY=1000 QLIST=hi:low QDEF=hi # CLASSCFG[batch] FLAGS=PREEMPTEE # CLASSCFG[interactive] FLAGS=PREEMPTOR JOBNODEMATCHPOLICY EXACTPROC #JOBNODEMATCHPOLICY EXACTNODE # estaba esta linea cuando quedaban procs ociosos [root@fe maui]# vi maui.cfg # maui.cfg 3.3.1 SERVERHOST fe #SERVERHOST castellani13.fisica.uns.edu.ar # primary admin must be first in list ADMIN1 root # Resource Manager Definition RMCFG[FE] TYPE=PBS # Allocation Manager Definition AMCFG[bank] TYPE=NONE # full parameter docs at http://supercluster.org/mauidocs/a.fparameters.html # use the 'schedctl -l' command to display current configuration RMPOLLINTERVAL 00:00:30 SERVERPORT 42559 "maui.cfg" 80L, 2096C # QOSCFG[hi] PRIORITY=100 XFTARGET=100 FLAGS=PREEMPTOR:IGNMAXJOB # QOSCFG[low] PRIORITY=-1000 FLAGS=PREEMPTEE # Standing Reservations: http://supercluster.org/mauidocs/7.1.3standingreservations.html # SRSTARTTIME[test] 8:00:00 # SRENDTIME[test] 17:00:00 # SRDAYS[test] MON TUE WED THU FRI # SRTASKCOUNT[test] 20 # SRMAXTIME[test] 0:30:00 # Creds: http://supercluster.org/mauidocs/6.1fairnessoverview.html # USERCFG[DEFAULT] FSTARGET=25.0 # USERCFG[john] PRIORITY=100 FSTARGET=10.0- # GROUPCFG[staff] PRIORITY=1000 QLIST=hi:low QDEF=hi # CLASSCFG[batch] FLAGS=PREEMPTEE # CLASSCFG[interactive] FLAGS=PREEMPTOR JOBNODEMATCHPOLICY EXACTPROC #JOBNODEMATCHPOLICY EXACTNODE ~ ------------------------------------------------ And this is my queue configuration: # # Create queues and set their attributes. # # # Create and define queue batch # create queue batch set queue batch queue_type = Execution set queue batch resources_default.nodes = 4 set queue batch resources_default.walltime = 4800:00:00 set queue batch enabled = True set queue batch started = True # # Set server attributes. # set server scheduling = True set server acl_hosts = fe set server managers = root@fe set server operators = root@fe set server default_queue = batch set server log_events = 511 set server mail_from = grumasica set server scheduler_iteration = 600 set server node_check_rate = 150 set server tcp_timeout = 6 set server log_level = 7 set server mom_job_sync = True set server mail_domain = uns.edu.ar set server keep_completed = 300 set server auto_node_np = True set server next_job_number = 5614 set server record_job_info = True set server record_job_script = True ---------------------------------- Thanks!!! -- * <http://www.uns.edu.ar> Universidad Nacional del Sur * Ing. Fernando Caba Director General de Telecomunicaciones Avda. Alem 1253, (B8000CPB) Bahía Blanca - Argentina Tel/Fax: (54)-291-4595166 Tel: (54)-291-4595101 int. 2050 http://www.dgt.uns.edu.ar _______________________________________________ mauiusers mailing list mauiusers@supercluster.org http://www.supercluster.org/mailman/listinfo/mauiusers_______________________________________________ mauiusers mailing list mauiusers@supercluster.org http://www.supercluster.org/mailman/listinfo/mauiusers
smime.p7s
Description: Firma criptográfica S/MIME
_______________________________________________ mauiusers mailing list mauiusers@supercluster.org http://www.supercluster.org/mailman/listinfo/mauiusers