Hi,

On 18.01.2016 at 01:39, Jordan Willis wrote:
    CompleteWait=60
    SlurmdUser=root
                 ^^^^ side note: really root? Why not a dedicated user?
[...]
    FastSchedule=1
    SchedulerType=sched/backfill
    ClusterName=mycluster
    SelectType=select/cons_res
    SelectTypeParameters=CR_CPU,CR_LLN
^^^^^^^^^^^^^ Memory isn't configured as a consumable resource. So "Resources" as the "(REASON)" in squeue can't be due to memory -- unless you didn't properly reload the config.

(But see scontrol show mentioned later.)
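
If you do want memory tracked per job, a minimal sketch of what that part of slurm.conf could look like -- assuming you want to keep CR_LLN; reconfigure/restart the daemons afterwards:

    SelectType=select/cons_res
    # CR_CPU_Memory makes both CPUs and memory consumable resources
    SelectTypeParameters=CR_CPU_Memory,CR_LLN

With that, pending jobs can actually be held back because of memory, not only CPUs.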

    AccountingStorageType=accounting_storage/slurmdbd

    SallocDefaultCommand="srun --mem-per-cpu=0 --pty --preserve-env --mpi=none $SHELL"

Your culprit? --mem-per-cpu=0
Do you specify more in the actual job file -- is it overridden?
I didn't find anything in the documentation saying that 0 means unlimited. But maybe I'm missing something: why is it possible to allocate jobs at all with --mem-per-cpu=0?
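
For batch jobs the request would have to be set in the job script, e.g. (the 4000 MB value and ./my_program are only placeholders):

    #!/bin/bash
    #SBATCH --ntasks=1
    #SBATCH --mem-per-cpu=4000   # per-CPU memory request in MB
    srun ./my_program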

    NodeName=imperial-node[01-10] CPUs=24 RealMemory=64 Sockets=2 CoresPerSocket=12 ThreadsPerCore=1
    NodeName=silver-node[01-28] CPUs=32 RealMemory=128 Sockets=2 CoresPerSocket=16 ThreadsPerCore=1
    NodeName=ocean-node01 CPUs=16 RealMemory=256 Sockets=2 CoresPerSocket=8 ThreadsPerCore=1
    NodeName=ocean-node[02-05] CPUs=32 RealMemory=256 Sockets=2 CoresPerSocket=8 ThreadsPerCore=2
    NodeName=loma-node[01-10] CPUs=24 RealMemory=64 Sockets=2 CoresPerSocket=6 ThreadsPerCore=2

As you can see, it looks like I should be treating all resources on a
per-CPU basis. I'm not sure what is wrong. Is there a command I can
use on a job to check that the queued jobs are actually waiting on an
available CPU? For me, they only say "Resources" -- can you get more
information about which resources?

E.g.
scontrol show job <JOBID>

A censored (XXX) "scontrol show job" output for JobId 2444 from our queue:

JobId=2444 Name=XXX.sh
   UserId=XXX(XXX) GroupId=staff(50)
   Priority=2 Account=(null) QOS=(null)
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 ExitCode=0:0
   RunTime=1-06:37:26 TimeLimit=UNLIMITED TimeMin=N/A
   SubmitTime=2016-01-17T11:21:35 EligibleTime=2016-01-17T11:21:35
   StartTime=2016-01-17T11:21:35 EndTime=Unknown
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   Partition=MC20GBplus AllocNode:Sid=XXX:XXX
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=XXX
   BatchHost=XXX
   NumNodes=1 NumCPUs=4 CPUs/Task=4 ReqS:C:T=*:*:*
   MinCPUsNode=4 MinMemoryNode=23000M MinTmpDiskNode=0
   Features=(null) Gres=(null) Reservation=(null)
   Shared=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/var/XXX.sh
   WorkDir=/var/XXX
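
(If you just want the pending reason together with the requested CPUs and memory on one line, squeue's format options can show that too; the job id here is only an example:)

    squeue -j 2444 -o "%.10i %.9P %.2t %.20r %.5C %.10m"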


In summary:
1. Memory isn't a consumable resource in your setup.
2. Just in case: "RealMemory=64" means 64 MB. I would expect 64 _Giga_B, so roughly RealMemory=65536.
Test case? (Currently won't be enforced anyway)
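
You can double-check the value directly on a node with "slurmd -C"; it prints the NodeName line with the hardware slurmd detects, including RealMemory in MB (the output below is only illustrative, actual values will differ):

    $ slurmd -C
    NodeName=imperial-node01 CPUs=24 Sockets=2 CoresPerSocket=12 ThreadsPerCore=1 RealMemory=64410
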
3. The only limit that is enforced, --mem-per-cpu, has to be overridden according to the section "Memory Management":
http://slurm.schedmd.com/cons_res_share.html
Still: I wonder why you are able to start _any_ jobs at all with that limit? I am not sure about when the enforcement described in the last paragraph of "Memory Management" kicks in:
"
Enforcement of a jobs memory allocation is performed by setting the "maximum data segment size" and the "maximum virtual memory size" system limits to the appropriate values before launching the tasks. Enforcement is also managed by the accounting plugin, which periodically gathers data about running jobs. Set JobAcctGather and JobAcctFrequency to values suitable for your system.
"

Without fully understanding all the consequences of your configuration: I wouldn't set --mem-per-cpu as part of SallocDefaultCommand in slurm.conf. Instead I would go with DefMemPerCPU, DefMemPerNode, MaxMemPerCPU and MaxMemPerNode, as mentioned in the second-to-last paragraph, and let the users set --mem-per-cpu themselves.
As recommended there.
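
A sketch of what that could look like in slurm.conf -- the MB values are placeholders and have to be matched to your nodes:

    SallocDefaultCommand="srun --pty --preserve-env --mpi=none $SHELL"
    # cluster-wide default and ceiling instead of hard-coding --mem-per-cpu
    DefMemPerCPU=2000
    MaxMemPerCPU=8000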

Regards, Benjamin


On Jan 16, 2016, at 7:34 AM, Benjamin Redling
<[email protected]> wrote:


Hello Jordan,

On 2016-01-16 01:21, Jordan Willis wrote:
If my partition is used up according to the node configuration, but
still has available CPUs, is there a way to let a user who only has a
task that needs 1 CPU onto one of those nodes?

For instance here is my partition:

NODELIST    NODES PARTITION  STATE  NODES(A/I) CPUS  CPUS(A/I/O/T)   MEMORY
loma-node[     38 all*       mix    38/0       16+   981/171/0/1152  64+


According to the nodes, there is nothing idling, but there are 171
available CPUs. Does anyone know what's going on? When a new user
asks for 1 task, why can't they get one of those free CPUs? What
should I change in my configuration?

Without seeing your configuration, that's just guesswork.
Are you using "select/linear" and "Shared=NO"?

Apart from that, you might want to look at the column "Resulting Behavior" to
get an idea of what to check in your config:
http://slurm.schedmd.com/cons_res_share.html
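
To see what is actually in effect right now, something along these lines should work (read-only; the grep pattern is just an example):

    scontrol show config | grep -i -E 'selecttype|mempercpu|mempernode'
    scontrol show partition   # shows the Shared= setting per partition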

Regards,
Benjamin
--
FSU Jena | JULIELab.de/Staff/Benjamin+Redling.html
vox: +49 3641 9 44323 | fax: +49 3641 9 44321
