On Mon, 15 Nov 2010 13:57:57 -0200 Denis Denis wrote:

Hi,

> > > Could you send your maui.cfg?
> >
> > Sure (I've added a couple of notes between lines).
> >
> > SERVERHOST NAME
> > ADMIN1 root
> > ADMIN3 edginfo rgma edguser monami
> > ADMINHOST NAME
> > RMCFG[base] TYPE=PBS TIMEOUT=30
> > SERVERPORT 40559
> > SERVERMODE NORMAL
> >
> > RMPOLLINTERVAL 00:02:00
> > LOGFILE /var/log/maui.log
> > LOGFILEMAXSIZE 50000000
> >
> > IDLEJOBDEPTH 300
> > #This come from a patch
> > #http://www.supercluster.org/pipermail/mauiusers/2009-February/003746.html
> >
> > BACKFILLPOLICY NONE
> > BACKFILLDEPTH 1
> > LOGLEVEL 1
> >
> > LOGFILEROLLDEPTH 50
> >
> > ENABLENEGJOBPRIORITY true
> > REJECTNEGPRIOJOBS false
> >
> > QUEUETIMEWEIGHT 0
> >
> > XFACTORWEIGHT 0
> >
> > CREDWEIGHT 1
> > GROUPWEIGHT 1
> > USERWEIGHT 1
> > CLASSWEIGHT 1
> >
> > NODEALLOCATIONPOLICY CPULOAD
> >
> > DEFERTIME 00:00:00
> >
> > CLASSCFG[long] MAXPROC=100
> > CLASSCFG[medium] MAXPROC=100
> > GROUPCFG[dteam] MAXPROC=40 PRIORITY=10
> > GROUPCFG[dtsgm] MAXPROC=2 PRIORITY=100000
> > GROUPCFG[dtprd] MAXPROC=20 PRIORITY=100000
> > GROUPCFG[ops] MAXPROC=20 PRIORITY=100000
> > GROUPCFG[pilotops] MAXPROC=20 PRIORITY=100000
> > USERCFG[arnaubria] PRIORITY=1000
> >
> > SRCFG[picsgm_64] GROUPLIST=atsgm,sgmcm,lhsgm,masgm,ctasgm,dtsgm,misgm,pasgm,picvosgm,sgmibergrid
> > SRCFG[picsgm_64] RESOURCES=PROCS:4
> > SRCFG[picsgm_64] PRIORITY=1000
> > SRCFG[picsgm_64] HOSTLIST=tditaller021
> > SRCFG[picsgm_64] STARTTIME=0:00:00 ENDTIME=24:00:00
> > SRCFG[picsgm_64] PERIOD=INFINITY
> >
> > FSWEIGHT 1
> > FSUSERWEIGHT 2
> > FSGROUPWEIGHT 10
> > FSQOSWEIGHT 100
> >
> > FSDEPTH 4
> > FSINTERVAL 12:00:00
> > FSDECAY 0.5
> > FSPOLICY DEDICATEDPS%
> >
> > GROUPCFG[masgm] FSTARGET=10 QDEF=magic MAXPROC=2
> > GROUPCFG[maprd] FSTARGET=10 QDEF=magic
> > GROUPCFG[magic] FSTARGET=10 QDEF=magic
> > QOSCFG[magic] FSTARGET=5.79
> > [....]
> >
> > OTHER QOS CONF
> > [...]
> >
> what does a diagnose -p report?

I don't have jobs running and my testing nodes are down (except my
torque-test server). But I can tell you that my jobs were on top (I'm
the arnaubria user, so my prio is 100000...).

> Is it possible that the jobs which are running before your highest
> priority job are not being backfilled but having a higher priority
> instead due to the weights of the other metrics?
> I see that the CREDWEIGHT is set to 1 while QOS for example is set to
> 100.

No, that's impossible. Other users' prios are based on FS. Their prios
go from negative values up to 200... I've never seen a prio higher than
that.

> Also there are some groups with priority really high ( 100000)

Those are very special groups (not the ones causing problems) and
myself.

Let me try to quickly reproduce a case in my prod cluster. I'll come
back in a few.

Cheers and thanks for your replies,
Arnau
_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers
