Hi,

On 18.01.2016 at 01:39, Jordan Willis wrote:
    CompleteWait=60
    SlurmdUser=root
                 ^^^^ side note: really root? Why not a dedicated user?
[...]
    FastSchedule=1
    SchedulerType=sched/backfill
    ClusterName=mycluster
    SelectType=select/cons_res
    SelectTypeParameters=CR_CPU,CR_LLN
^^^^^^^^^^^^^ Memory isn't configured as a consumable resource. So "Resources" as the "(REASON)" in squeue can't be due to memory -- unless you didn't properly reload the config.

(But see scontrol show mentioned later.)
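
If you do want memory tracked per job, a minimal sketch of what that part of slurm.conf could look like -- assuming you want to keep CR_LLN; reconfigure/restart the daemons afterwards:

    SelectType=select/cons_res
    # CR_CPU_Memory makes both CPUs and memory consumable resources
    SelectTypeParameters=CR_CPU_Memory,CR_LLN

With that, pending jobs can actually be held back because of memory, not only CPUs.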

    AccountingStorageType=accounting_storage/slurmdbd

    SallocDefaultCommand="srun --mem-per-cpu=0 --pty --preserve-env --mpi=none $SHELL"

Your culprit? --mem-per-cpu=0
Do you specify more in the actual job file -- is it overridden?
I didn't find anything in the documentation saying that 0 means unlimited. But maybe I'm missing something: why is it possible to allocate jobs at all with --mem-per-cpu=0?
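
For batch jobs the request would have to be set in the job script, e.g. (the 4000 MB value and ./my_program are only placeholders):

    #!/bin/bash
    #SBATCH --ntasks=1
    #SBATCH --mem-per-cpu=4000   # per-CPU memory request in MB
    srun ./my_program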

    NodeName=imperial-node[01-10] CPUs=24 RealMemory=64 Sockets=2 CoresPerSocket=12 ThreadsPerCore=1
    NodeName=silver-node[01-28] CPUs=32 RealMemory=128 Sockets=2 CoresPerSocket=16 ThreadsPerCore=1
    NodeName=ocean-node01 CPUs=16 RealMemory=256 Sockets=2 CoresPerSocket=8 ThreadsPerCore=1
    NodeName=ocean-node[02-05] CPUs=32 RealMemory=256 Sockets=2 CoresPerSocket=8 ThreadsPerCore=2
    NodeName=loma-node[01-10] CPUs=24 RealMemory=64 Sockets=2 CoresPerSocket=6 ThreadsPerCore=2

As you can see, it looks like I should be treating all resources on a
per-CPU basis. I'm not sure what is wrong. Is there a command I can
use on a job to check that the queued jobs are actually waiting on an
available CPU? For me, they only say "Resources" -- can you get more
information about which resources?

E.g.
scontrol show job <JOBID>

A censored (XXX) "scontrol show job" output for JobId 2444 from our queue:

JobId=2444 Name=XXX.sh
   UserId=XXX(XXX) GroupId=staff(50)
   Priority=2 Account=(null) QOS=(null)
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 ExitCode=0:0
   RunTime=1-06:37:26 TimeLimit=UNLIMITED TimeMin=N/A
   SubmitTime=2016-01-17T11:21:35 EligibleTime=2016-01-17T11:21:35
   StartTime=2016-01-17T11:21:35 EndTime=Unknown
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   Partition=MC20GBplus AllocNode:Sid=XXX:XXX
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=XXX
   BatchHost=XXX
   NumNodes=1 NumCPUs=4 CPUs/Task=4 ReqS:C:T=*:*:*
   MinCPUsNode=4 MinMemoryNode=23000M MinTmpDiskNode=0
   Features=(null) Gres=(null) Reservation=(null)
   Shared=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/var/XXX.sh
   WorkDir=/var/XXX
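
(If you just want the pending reason together with the requested CPUs and memory on one line, squeue's format options can show that too; the job id here is only an example:)

    squeue -j 2444 -o "%.10i %.9P %.2t %.20r %.5C %.10m"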


In summary:
1. Memory isn't a consumable resource in your setup.
2. Just in case: "RealMemory=64" means 64 MB. I would expect 64 _Giga_B, so roughly RealMemory=65536.
Test case? (Currently won't be enforced anyway)
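
You can double-check the value directly on a node with "slurmd -C"; it prints the NodeName line with the hardware slurmd detects, including RealMemory in MB (the output below is only illustrative, actual values will differ):

    $ slurmd -C
    NodeName=imperial-node01 CPUs=24 Sockets=2 CoresPerSocket=12 ThreadsPerCore=1 RealMemory=64410
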
3. The only limit that is enforced, --mem-per-cpu, has to be overridden according to the section "Memory Management":
http://slurm.schedmd.com/cons_res_share.html
Still: I wonder why you are able to start _any_ jobs at all with that limit? I am not sure about when the enforcement described in the last paragraph of "Memory Management" kicks in:
"
Enforcement of a jobs memory allocation is performed by setting the "maximum data segment size" and the "maximum virtual memory size" system limits to the appropriate values before launching the tasks. Enforcement is also managed by the accounting plugin, which periodically gathers data about running jobs. Set JobAcctGather and JobAcctFrequency to values suitable for your system.
"

Without fully understanding all the consequences of your configuration: I wouldn't set --mem-per-cpu as part of SallocDefaultCommand in slurm.conf. Instead I would go with DefMemPerCPU, DefMemPerNode, MaxMemPerCPU and MaxMemPerNode, as mentioned in the second-to-last paragraph, and let the users set --mem-per-cpu themselves.
As recommended there.
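
A sketch of what that could look like in slurm.conf -- the MB values are placeholders and have to be matched to your nodes:

    SallocDefaultCommand="srun --pty --preserve-env --mpi=none $SHELL"
    # cluster-wide default and ceiling instead of hard-coding --mem-per-cpu
    DefMemPerCPU=2000
    MaxMemPerCPU=8000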

Regards, Benjamin


On Jan 16, 2016, at 7:34 AM, Benjamin Redling
<[email protected]> wrote:


Hello Jordan,

On 2016-01-16 01:21, Jordan Willis wrote:
If my partition is used up according to the node configuration, but
still has available CPUs, is there a way to let a user who only has a
task that needs 1 CPU onto one of those nodes?

For instance here is my partition:

NODELIST    NODES PARTITION  STATE  NODES(A/I) CPUS  CPUS(A/I/O/T)   MEMORY
loma-node[     38 all*       mix    38/0       16+   981/171/0/1152  64+


According to the nodes, there is nothing idling, but there are 171
available CPUs. Does anyone know what's going on? When a new user
asks for 1 task, why can't they get one of those free CPUs? What
should I change in my configuration?

Without seeing your configuration, that's just guesswork.
Are you using "select/linear" and "Shared=NO"?

Apart from that, you might want to look at the column "Resulting Behavior" to
get an idea of what to check in your config:
http://slurm.schedmd.com/cons_res_share.html
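
To see what is actually in effect right now, something along these lines should work (read-only; the grep pattern is just an example):

    scontrol show config | grep -i -E 'selecttype|mempercpu|mempernode'
    scontrol show partition   # shows the Shared= setting per partition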

Regards,
Benjamin
--
FSU Jena | JULIELab.de/Staff/Benjamin+Redling.html
vox: +49 3641 9 44323 | fax: +49 3641 9 44321
