Hi,

MaxMemPerCPU doesn't seem to be working as documented: --mem-per-cpu
does not increase the CPU count when MaxMemPerCPU is exceeded.

For reference the man page says this:

Note that if the job's --mem-per-cpu value exceeds the configured
MaxMemPerCPU, then the user's limit will be treated as a memory limit
per task; --mem-per-cpu will be reduced to a value no larger than
MaxMemPerCPU; --cpus-per-task will be set and value of --cpus-per-task
multiplied by the new --mem-per-cpu value will equal the original
--mem-per-cpu value specified by the user. 


I can't get that to happen with --mem-per-cpu, but that does happen when
I use --mem.
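
To spell out what I expect from that paragraph, here is a rough sketch of
the arithmetic as I read it. This is only my interpretation of the man page
text, not Slurm's actual code; the function name, the rounding up, and the
example values are my own assumptions.

```python
def expected_adjustment(mem_per_cpu_mb, max_mem_per_cpu_mb):
    """Hypothetical helper: my reading of the man page, not Slurm's logic.

    Returns (cpus_per_task, new_mem_per_cpu_mb).
    """
    if mem_per_cpu_mb <= max_mem_per_cpu_mb:
        # Request is within the limit, so nothing should change.
        return 1, mem_per_cpu_mb
    # Smallest CPU count that brings the per-CPU memory under the limit
    # (integer ceiling division).
    cpus = -(-mem_per_cpu_mb // max_mem_per_cpu_mb)
    # Spread the original request over those CPUs, rounding up (assumption).
    new_mem = -(-mem_per_cpu_mb // cpus)
    return cpus, new_mem


# Example: a 2100 MB request against an assumed 2000 MB per-CPU limit.
print(expected_adjustment(2100, 2000))
```

So with a 2000 MB limit I would expect a 2100 MB request to end up with
2 CPUs at 1050 MB each, rather than being rejected.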

I have a partition named mic with DefMemPerCPU=2000 and MaxMemPerCPU=2000
set.

I get this with --mem-per-cpu=2100

$ srun -p mic --mem-per-cpu=2100 ls
srun: error: Unable to allocate resources: Memory required by task is not available

But this works and increases the number of cpus:
$ srun -p mic --mem=2100 ls

With --mem, this shows up in the debug log:
[2014-05-30T14:35:23.001] debug:  Setting job's pn_min_cpus to 2 due to memory limit

and the job record shows NumCPUs=2 and MinCPUsNode=2:

JobId=8792976 Name=ls
   UserId=wettstein(891783663) GroupId=wettstein(891783663)
   Priority=111812 Account=rcc-staff QOS=mic
   JobState=COMPLETED Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=0 ExitCode=0:0
   RunTime=00:00:03 TimeLimit=1-12:00:00 TimeMin=N/A
   SubmitTime=2014-05-30T14:35:23 EligibleTime=2014-05-30T14:35:23
   StartTime=2014-05-30T14:35:23 EndTime=2014-05-30T14:35:26
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   Partition=mic AllocNode:Sid=midway-login2:27313
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=midway-mic01
   BatchHost=midway-mic01
   NumNodes=1 NumCPUs=2 CPUs/Task=1 ReqS:C:T=*:*:*
   MinCPUsNode=2 MinMemoryNode=2100M MinTmpDiskNode=0
   Features=(null) Gres=(null) Reservation=(null)
   Shared=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/bin/ls
   WorkDir=/software/src/slurm

So I guess either the documentation is incorrect and this behavior should
be described under the --mem option, or there is a bug in the logic.

I basically want to use this so that users get charged for the whole node,
instead of being able to request just 1 CPU and all of the memory on the node.

Andy

-- 
andy wettstein
hpc system administrator
research computing center
university of chicago
773.702.1104
