I've just applied the patch you suggested ( http://www.clusterresources.com/pipermail/mauiusers/2006-February/002009.html ) and everything seems to work fine now! Thank you very much!

Francesco

etienne gondet wrote:
>
> Apparently there is a bug in fairness with patch 14 that doubles
> the real usage of procs or jobs:
>
> USERCFG[DEFAULT] MAXJOB=3
>
> so with 2 serial jobs you probably violate this hard limit.
>
> There is a patch (or just go back to p13):
>
> http://www.clusterresources.com/pipermail/mauiusers/2006-February/002009.html
>
> Etienne Gondet.
>
> Francesco Del Citto wrote:
>
>> Hi!
>> I have a problem with "HARD MAXJOB LIMIT", using maui 3.2.6p14 and
>> torque 2.0.0p7.
>> In maui.cfg I have:
>>
>> USERCFG[DEFAULT] MAXPROC=8,11
>> USERCFG[DEFAULT] MAXJOB=3
>> USERCFG[DEFAULT] FSTARGET=25.0
>> USERCFG[DEFAULT] PRIORITY=1000
>> GROUPCFG[DEFAULT] MAXPROC=8,11
>> GROUPCFG[DEFAULT] MAXJOB=6
>> GROUPCFG[DEFAULT] FSTARGET=25.0
>> USERCFG[DEFAULT] PRIORITY=1000
>>
>> GROUPCFG[kiva] MAXJOB=8
>> GROUPCFG[kiva] MAXPROC=11
>> GROUPCFG[kiva] PRIORITY=5000
>>
>> and I'm trying to run 3 jobs: 2 serial jobs and 1 parallel job (2
>> processors), submitting them in this order (first the serial jobs,
>> then the parallel one). The cluster is completely free now.
>> This is what I get with showq and checkjob:
>>
>> ------------------------------------------------------------------------------
>> [EMAIL PROTECTED] run]$ showq
>> ACTIVE JOBS--------------------
>> JOBNAME  USERNAME   STATE    PROC  REMAINING    STARTTIME
>>
>> 2046     francesco  Running  1     99:23:35:36  Tue Feb 14 09:23:42
>> 2047     francesco  Running  1     99:23:51:22  Tue Feb 14 09:39:28
>>
>> 2 Active Jobs    2 of 9 Processors Active (22.22%)
>>
>> IDLE JOBS----------------------
>> JOBNAME  USERNAME   STATE    PROC  WCLIMIT      QUEUETIME
>>
>> 0 Idle Jobs
>>
>> BLOCKED JOBS----------------
>> JOBNAME  USERNAME   STATE    PROC  WCLIMIT      QUEUETIME
>>
>> 2048     francesco  Idle     2     99:23:59:59  Tue Feb 14 09:40:21
>>
>> Total Jobs: 3   Active Jobs: 2   Idle Jobs: 0   Blocked Jobs: 1
>> ------------------------------------------------------------------------------
>>
>> ------------------------------------------------------------------------------
>> [EMAIL PROTECTED] run]$ checkjob 2048
>>
>> checking job 2048
>>
>> State: Idle
>> Creds: user:francesco group:kiva class:batch qos:DEFAULT
>> WallTime: 00:00:00 of 99:23:59:59
>> SubmitTime: Tue Feb 14 09:40:21
>> (Time Queued Total: 00:08:39 Eligible: 00:00:00)
>>
>> Total Tasks: 2
>>
>> Req[0] TaskCount: 2 Partition: ALL
>> Network: [NONE] Memory >= 0 Disk >= 0 Swap >= 0
>> Opsys: [NONE] Arch: [NONE] Features: [new]
>> NodeCount: 2
>>
>> IWD: [NONE] Executable: [NONE]
>> Bypass: 0 StartCount: 0
>> PartitionMask: [ALL]
>> Flags: RESTARTABLE
>>
>> PE: 2.00 StartPriority: 6725
>> cannot select job 2048 for partition DEFAULT (job 2048 violates active HARD
>> MAXJOB limit of 3 for user francesco (R: 1, U: 4)
>> )
>> ------------------------------------------------------------------------------
>>
>> Is it a bug or a misconfiguration?
>>
>> Francesco
>> _______________________________________________
>> mauiusers mailing list
>> [email protected]
>> http://www.supercluster.org/mailman/listinfo/mauiusers
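
For anyone hitting the same message, here is a minimal sketch of the arithmetic behind "R: 1, U: 4". It assumes the scheduler blocks a job when current usage plus the request exceeds the hard limit; that rule and the function name `violates_hard_limit` are illustrative assumptions, not Maui's actual code. The numbers are taken from the maui.cfg and checkjob output quoted above.

```python
def violates_hard_limit(usage: int, requested: int, limit: int) -> bool:
    """Assumed rule: block the job when usage + requested exceeds the limit."""
    return usage + requested > limit


MAXJOB = 3        # USERCFG[DEFAULT] MAXJOB=3
running_jobs = 2  # jobs 2046 and 2047 are running
requested = 1     # job 2048 counts as one job (R: 1)

correct_usage = running_jobs      # what the usage should be
doubled_usage = 2 * running_jobs  # matches U: 4 reported by checkjob

# With correct accounting the third job fits within the limit.
print(violates_hard_limit(correct_usage, requested, MAXJOB))
# With usage counted twice, the same job is reported as a violation.
print(violates_hard_limit(doubled_usage, requested, MAXJOB))
```

With correct accounting, 2 running jobs plus 1 requested equals the limit of 3, so job 2048 should start; with usage doubled to 4 the limit is already exceeded, which is consistent with the blocked state shown by showq.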
