Hi,

the output of "diagnose -q" indicates that queue "inmetro" has a PROC limit of 232, but uses 360 procs at present. Is this true? (Your output doesn't show the queues the jobs are running in.)

Is there something like

  CLASSCFG[inmetro] MAXPROC=232

in your maui.cfg? That would explain it. Or did you set queue limits in torque?

Regards,
Burkhard Bunk.

----------------------------------------------------------------------
 [email protected]       Physics Institute, Humboldt University
 fax:   ++49-30 2093 7628     Newtonstr. 15
 phone: ++49-30 2093 7980     12489 Berlin, Germany
----------------------------------------------------------------------

On Fri, 6 Sep 2013, Denis wrote:
I can't figure out why some jobs are getting blocked for a soft limit while there are no other jobs queued. Can someone shed some light on this?

Diagnosing blocked jobs (policylevel SOFT partition ALL)
diagnose -q

job 712 violates active SOFT MAXPROC limit of 232 for class inmetro (R: 40, U: 360)
job 713 violates active SOFT MAXPROC limit of 232 for class inmetro (R: 40, U: 360)
job 714 violates active SOFT MAXPROC limit of 232 for class inmetro (R: 40, U: 360)
job 715 violates active SOFT MAXPROC limit of 232 for class inmetro (R: 40, U: 360)
job 716 violates active SOFT MAXPROC limit of 232 for class inmetro (R: 40, U: 360)
job 717 violates active SOFT MAXPROC limit of 232 for class inmetro (R: 40, U: 360)
job 718 violates active SOFT MAXPROC limit of 232 for class inmetro (R: 40, U: 360)
job 719 violates active SOFT MAXPROC limit of 232 for class inmetro (R: 40, U: 360)

dirac:/usr/local/maui/bin # showq

ACTIVE JOBS--------------------
JOBNAME   USERNAME   STATE     PROC   REMAINING     STARTTIME

528       marcos     Running      8    1:19:03:54   Thu Sep 5 18:26:59
702       versatus   Running     40   99:23:50:10   Fri Sep 6 11:13:16
703       versatus   Running     40   99:23:50:14   Fri Sep 6 11:13:20
704       versatus   Running     40   99:23:50:44   Fri Sep 6 11:13:50
705       versatus   Running     40   99:23:50:45   Fri Sep 6 11:13:51
706       versatus   Running     40   99:23:50:45   Fri Sep 6 11:13:51
707       versatus   Running     40   99:23:50:46   Fri Sep 6 11:13:52
708       versatus   Running     40   99:23:50:46   Fri Sep 6 11:13:52
709       versatus   Running     40   99:23:50:47   Fri Sep 6 11:13:53
710       versatus   Running     40   99:23:50:47   Fri Sep 6 11:13:53
711       versatus   Running     40   99:23:54:28   Fri Sep 6 11:17:34

11 Active Jobs   408 of 440 Processors Active (92.73%)
                  27 of  27 Nodes Active      (100.00%)

IDLE JOBS----------------------
JOBNAME   USERNAME   STATE     PROC   WCLIMIT       QUEUETIME

0 Idle Jobs

BLOCKED JOBS----------------
JOBNAME   USERNAME   STATE     PROC   WCLIMIT       QUEUETIME

712       versatus   Idle        40   99:23:59:59   Fri Sep 6 11:13:53
713       versatus   Idle        40   99:23:59:59   Fri Sep 6 11:13:53
714       versatus   Idle        40   99:23:59:59   Fri Sep 6 11:13:54
715       versatus   Idle        40   99:23:59:59   Fri Sep 6 11:13:54
716       versatus   Idle        40   99:23:59:59   Fri Sep 6 11:13:55
717       versatus   Idle        40   99:23:59:59   Fri Sep 6 11:13:55
718       versatus   Idle        40   99:23:59:59   Fri Sep 6 11:13:55
719       versatus   Idle        40   99:23:59:59   Fri Sep 6 11:13:56

Total Jobs: 19   Active Jobs: 11   Idle Jobs: 0   Blocked Jobs: 8

dirac:/usr/local/maui/bin # showres

Reservations

ReservationID   Type  S      Start          End     Duration    N/P   StartTime

528             Job   R  -16:57:20   1:19:02:40   2:12:00:00    1/8   Thu Sep 5 18:26:59
707             Job   R  -00:10:27  99:23:49:32  99:23:59:59   5/40   Fri Sep 6 11:13:52
709             Job   R  -00:10:26  99:23:49:33  99:23:59:59   5/40   Fri Sep 6 11:13:53
711             Job   R  -00:06:45  99:23:53:14  99:23:59:59   5/40   Fri Sep 6 11:17:34
712             Job   R  -00:00:36  99:23:59:23  99:23:59:59   5/40   Fri Sep 6 11:23:43
713             Job   R  -00:00:36  99:23:59:23  99:23:59:59   5/40   Fri Sep 6 11:23:43
714             Job   R   00:00:00  99:23:59:59  99:23:59:59   5/40   Fri Sep 6 11:24:19
715             Job   R   00:00:00  99:23:59:59  99:23:59:59   5/40   Fri Sep 6 11:24:19
716             Job   R   00:00:00  99:23:59:59  99:23:59:59   5/40   Fri Sep 6 11:24:19
717             Job   R   00:00:00  99:23:59:59  99:23:59:59   5/40   Fri Sep 6 11:24:19
718             Job   R   00:00:00  99:23:59:59  99:23:59:59   5/40   Fri Sep 6 11:24:19

11 reservations located

dirac:/usr/local/maui/bin # showstate

cluster state summary for Fri Sep 6 11:25:00

    JobName            S User      Group    Procs Remaining   StartTime
    ------------------ - --------- -------- ----- ----------- -------------------

(A) 528                R marcos    users        8  1:19:01:59 Thu Sep 5 18:26:59
(B) 711                R versatus  inmetro     40 99:23:52:33 Fri Sep 6 11:17:34
(C) 712                R versatus  inmetro     40 99:23:58:42 Fri Sep 6 11:23:43
(D) 713                R versatus  inmetro     40 99:23:58:42 Fri Sep 6 11:23:43
(E) 714                R versatus  inmetro     40 99:23:59:18 Fri Sep 6 11:24:19
(F) 715                R versatus  inmetro     40 99:23:59:18 Fri Sep 6 11:24:19
(G) 716                R versatus  inmetro     40 99:23:59:18 Fri Sep 6 11:24:19
(H) 717                R versatus  inmetro     40 99:23:59:18 Fri Sep 6 11:24:19
(I) 718                R versatus  inmetro     40 99:23:59:18 Fri Sep 6 11:24:19
(J) 719                R versatus  inmetro     40 99:23:59:59 Fri Sep 6 11:25:00

usage summary: 10 active jobs 46 active nodes

[0] [1]
Frame 01: [J]

Key: [?]:Unknown [*]:Down w/Job [#]:Down [ ]:Idle [@] Busy w/No Job [!] Drained

thanks in advance
--
D
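
The numbers in the "diagnose -q" lines can be checked with a few lines of Python. This is only a rough sketch, not anything Maui itself provides: the regular expression is written against the lines exactly as quoted above, the function name parse_violation is made up for illustration, and reading "R" as the procs requested by the job and "U" as the procs already in use by the class is an assumption based on the reply (232 limit, 360 in use). How Maui evaluates the limit internally may differ.

```python
import re

# Pattern for violation lines as they appear in the quoted "diagnose -q" output.
# Assumption: R = procs requested by the job, U = procs already used by the class.
LINE_RE = re.compile(
    r"job (\d+) violates active (SOFT|HARD) (\w+) limit of (\d+) "
    r"for class (\w+) \(R: (\d+), U: (\d+)\)"
)


def parse_violation(line):
    """Return the fields of one violation line as a dict, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    job, level, policy, limit, cls, requested, used = m.groups()
    return {
        "job": int(job),
        "level": level,
        "policy": policy,
        "limit": int(limit),
        "class": cls,
        "requested": int(requested),
        "used": int(used),
    }


sample = ("job 712 violates active SOFT MAXPROC limit of 232 "
          "for class inmetro (R: 40, U: 360)")
v = parse_violation(sample)

# How far the class would be over its limit if this job were started as well.
over = v["used"] + v["requested"] - v["limit"]
print(f"job {v['job']}: class {v['class']} would be {over} procs over "
      f"its {v['level']} {v['policy']} limit of {v['limit']}")
```

Under that reading, class inmetro is already above the 232-proc soft limit before any of the blocked jobs are counted, which fits the question of whether a MAXPROC setting exists in maui.cfg.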
