Thanks Alan, that's a good point!

From my reading of the documentation, and from what I see when I examine
the job, slurm is accepting it and simply adjusting the parameters to
match. In this instance, even though I only requested 1 CPU from srun,
scontrol show job lists the following:

   MinCPUsNode=2 MinMemoryNode=4G MinTmpDiskNode=0

When I use --mem=4G instead of --mem-per-cpu=4G it all behaves exactly as
expected.
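
For what it's worth, the adjustment appears to keep the job's total memory
fixed and scale the CPU count up until the per-CPU share fits under the cap.
A rough sketch of that arithmetic (my reading of the behaviour, not slurm's
actual code):

```shell
# With slurm.conf containing MaxMemPerCPU=2048, the request
#   srun -n1 --mem-per-cpu=4G hostname
# is not rejected; the controller keeps total memory constant and
# raises the CPU count until the per-CPU share fits under the cap.
# The same adjustment as ceiling-division arithmetic (values in MB):
req_cpus=1
mem_per_cpu=4096      # --mem-per-cpu=4G
max_mem_per_cpu=2048  # slurm.conf MaxMemPerCPU
total_mem=$((req_cpus * mem_per_cpu))
cpus=$(( (total_mem + max_mem_per_cpu - 1) / max_mem_per_cpu ))
echo "$cpus"
```

which gives 2, matching the MinCPUsNode=2 that scontrol reports.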

The problem is that the job gets accepted, and is even flagged as running
in squeue, even though the allocated node knows nothing about the job! In
this case I would prefer that slurm simply reject the job submission.

What we are really trying to do is prevent users who request a lot of RAM
per job from dominating our resources. We're using Multifactor + FairShare
+ QOS, and from a CPU usage point of view it's working well to even out
access across all our users.

Does anyone have a good incantation for limiting the memory hogs, or know
when the hinted-at memory integral component of priority/multifactor will
be available?
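
One blunt workaround we've considered is steering big-memory work into a
restricted partition with its own cap. A hypothetical slurm.conf sketch
(partition names, node ranges, and limits are made up, and per-partition
MaxMemPerCPU support should be checked against your version's slurm.conf
man page):

```shell
# slurm.conf fragment (illustrative only):
#
#   PartitionName=batch   Nodes=n[31-33] Default=YES MaxMemPerCPU=2048
#   PartitionName=bigmem  Nodes=n[31-33] AllowGroups=bigmem_users
#
# Ordinary jobs land in "batch" and stay under the per-CPU cap;
# anything needing more memory has to go through "bigmem", which
# only a vetted group can use.
```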

Chris



On Fri, Jun 14, 2013 at 2:55 PM, Alan V. Cowles <[email protected]> wrote:

>  Hey Chris,
>
> I believe, based on a similar use case we experienced here, that the srun
> option you are passing ('--mem-per-cpu=4G') is what you are requesting, not
> defining. With your max set to 2048, submitting your job means requesting
> a node with twice those resources, which means there are no available
> resources, so slurmctld cannot satisfy your request. I bet if you modified
> your request to =2G it would run fine, or if you set MaxMemPerCPU=4096
> your job in its current state would run fine.
>
> AC
>
> On 06/14/2013 03:49 PM, Chris Read wrote:
>
> I've just seen similar behaviour with slurm 2.5.7.
>
>  When we set the following in slurm.conf:
>
>  MaxMemPerCPU=2048
>
>  And run the following command:
>
>  srun --mem-per-cpu=4G hostname
>
>  We get the following on the command line:
>
>   srun: job 27801 queued and waiting for resources
> srun: job 27801 has been allocated resources
> srun: Job step creation temporarily disabled, retrying
>
>  With slurmctld and slurmd both running with '-vvv' we see the following
> in the log files:
>
>  slurmctld.log:
>
>  [2013-06-14T14:42:53-05:00] debug2: select_p_job_test for job 27801
> [2013-06-14T14:42:53-05:00] debug2: got 1 threads to send out
> [2013-06-14T14:42:53-05:00] debug2: _adjust_limit_usage: job 27801: MPC:
> job_memory set to 4096
> [2013-06-14T14:42:53-05:00] debug2: Tree head got back 0 looking for 3
> [2013-06-14T14:42:53-05:00] sched: Allocate JobId=27801 NodeList=n32
> #CPUs=2
> [2013-06-14T14:42:53-05:00] debug2: Spawning RPC agent for msg_type 4002
> [2013-06-14T14:42:53-05:00] debug2: Performing full system state save
> [2013-06-14T14:42:53-05:00] debug2: got 1 threads to send out
> [2013-06-14T14:42:53-05:00] debug2: Tree head got back 1
> [2013-06-14T14:42:53-05:00] debug2: Tree head got back 2
> [2013-06-14T14:42:53-05:00] debug2: Tree head got back 3
> [2013-06-14T14:42:53-05:00] debug2: Tree head got them all
> [2013-06-14T14:42:53-05:00] debug2: _slurm_rpc_job_ready(27801)=3 usec=6
> [2013-06-14T14:42:53-05:00] debug2: Processing RPC:
> REQUEST_JOB_STEP_CREATE from uid=0
> [2013-06-14T14:42:53-05:00] debug:  Configuration for job 27801 complete
> [2013-06-14T14:42:53-05:00] _slurm_rpc_job_step_create for job 27801:
> Requested nodes are busy
> [2013-06-14T14:42:53-05:00] debug2: node_did_resp n31
> [2013-06-14T14:42:53-05:00] debug2: node_did_resp n33
> [2013-06-14T14:42:53-05:00] debug2: node_did_resp n32
> [2013-06-14T14:42:53-05:00] debug2: Processing RPC:
> REQUEST_JOB_STEP_CREATE from uid=0
> [2013-06-14T14:42:53-05:00] debug:  Configuration for job 27801 complete
> [2013-06-14T14:42:53-05:00] _slurm_rpc_job_step_create for job 27801:
> Requested nodes are busy
> [2013-06-14T14:42:54-05:00] debug2: Processing RPC:
> REQUEST_JOB_STEP_CREATE from uid=0
> [2013-06-14T14:42:54-05:00] debug:  Configuration for job 27801 complete
> [2013-06-14T14:42:54-05:00] _slurm_rpc_job_step_create for job 27801:
> Requested nodes are busy
> [2013-06-14T14:42:54-05:00] debug2: Processing RPC:
> REQUEST_JOB_STEP_CREATE from uid=0
> [2013-06-14T14:42:54-05:00] debug:  Configuration for job 27801 complete
> [2013-06-14T14:42:54-05:00] _slurm_rpc_job_step_create for job 27801:
> Requested nodes are busy
> [2013-06-14T14:42:55-05:00] debug2: Processing RPC:
> REQUEST_JOB_STEP_CREATE from uid=0
> [2013-06-14T14:42:55-05:00] debug:  Configuration for job 27801 complete
> [2013-06-14T14:42:55-05:00] _slurm_rpc_job_step_create for job 27801:
> Requested nodes are busy
> [2013-06-14T14:42:57-05:00] debug2: Processing RPC:
> REQUEST_JOB_STEP_CREATE from uid=0
> [2013-06-14T14:42:57-05:00] debug:  Configuration for job 27801 complete
> [2013-06-14T14:42:57-05:00] _slurm_rpc_job_step_create for job 27801:
> Requested nodes are busy
> [2013-06-14T14:43:02-05:00] debug2: Processing RPC:
> REQUEST_JOB_STEP_CREATE from uid=0
> [2013-06-14T14:43:02-05:00] debug:  Configuration for job 27801 complete
> [2013-06-14T14:43:02-05:00] _slurm_rpc_job_step_create for job 27801:
> Requested nodes are busy
>
>
>  slurmd.log:
>
>  [2013-06-14T14:42:53-05:00] debug2: got this type of message 1011
> [2013-06-14T14:42:53-05:00] debug2: Processing RPC: REQUEST_HEALTH_CHECK
> [2013-06-14T14:42:53-05:00] debug:  attempting to run health_check
> [/srv/slurm/sbin/healthcheck.sh]
>
>
>  It looks as though the problem is solely with slurmctld, as slurmd
> never seems to get any request for the job!
>
>  Commenting out the MaxMemPerCPU makes it all better again...
>
>  Anyone have any ideas?
>
>  Chris
>
>
>
> On Mon, Jun 3, 2013 at 9:26 AM, Danny Auble <[email protected]> wrote:
>
>>  Pre1 is extremely old and most likely has many bugs. Please try pre4
>> (or better yet the git master) and see if the problem still exists.
>>
>> Also, I am not sure if you are aware, but --ntasks and -n are the
>> same.
>>
>> Danny
>>
>>
>> Tommi T <[email protected]> wrote:
>>>
>>> Hi
>>>
>>> I don't understand why the node is busy after the job is launched.
>>>
>>> slurm 2.6.0-0pre1
>>>
>>> grep 86742 /slurmdb/log/Slurmctld.log
>>>
>>>
>>> [2013-06-03T09:45:46+03:00] _slurm_rpc_submit_batch_job JobId=86742 usec=589
>>> [2013-06-03T09:46:12+03:00] backfill: Started JobId=86742 on c196
>>> [2013-06-03T09:46:14+03:00] _slurm_rpc_job_step_create for job 86742: 
>>> Requested nodes are busy
>>>
>>>
>>> [2013-06-03T09:47:14+03:00] _slurm_rpc_job_step_create for job 86742: 
>>> Requested nodes are busy
>>>
>>> c196:
>>>
>>> scontrol show node c196
>>> NodeName=c196 Arch=x86_64 CoresPerSocket=8
>>> CPUAlloc=16 CPUErr=0 CPUTot=16 CPULoad=4.40 Features=(null)
>>>
>>>
>>> Gres=(null)
>>> NodeAddr=c196 NodeHostName=c196
>>> OS=Linux RealMemory=64000 Sockets=2 Boards=1
>>> State=ALLOCATED ThreadsPerCore=1 TmpDisk=1800000 Weight=10
>>> BootTime=2013-05-16T17:45:13
>>> SlurmdStartTime=2013-05-16T17:46:48
>>> CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
>>>
>>>
>>> job error file:
>>> srun: mem < mem-per-cpu - resizing mem to be equal to mem-per-cpu
>>> srun: Job step creation temporarily disabled, retrying
>>>
>>>
>>>
>>> c196 Slurmd.log
>>>
>>> [2013-06-03T09:46:12+03:00] Launching batch job 86742 for UID 18991
>>> [2013-06-03T09:46:12+03:00] Job accounting gather LINUX plugin loaded
>>> [2013-06-03T09:46:12+03:00] switch NONE plugin loaded
>>>
>>>
>>> [2013-06-03T09:46:12+03:00] Received cpu frequency information for 16 cpus
>>> [2013-06-03T09:46:12+03:00] [86742] task/cgroup: loaded
>>> [2013-06-03T09:46:12+03:00] [86742] Checkpoint plugin loaded: 
>>> checkpoint/none
>>> [2013-06-03T09:46:12+03:00] [86742] debug level = 2
>>>
>>>
>>> [2013-06-03T09:46:12+03:00] [86742] task 0 (18044) started 
>>> 2013-06-03T09:46:12+03:00
>>> [2013-06-03T09:46:12+03:00] [86742] AcctGatherEnergy NONE plugin loaded
>>> [2013-06-03T09:46:42+03:00] Launching batch job
>>> 86743 for UID 18991
>>> [2013-06-03T09:46:42+03:00] Job accounting gather LINUX plugin loaded
>>> [2013-06-03T09:46:42+03:00] switch NONE plugin loaded
>>> [2013-06-03T09:46:42+03:00] Received cpu frequency information for 16 cpus
>>>
>>>
>>> [2013-06-03T09:46:42+03:00] [86743] task/cgroup: loaded
>>> [2013-06-03T09:46:42+03:00] [86743] Checkpoint plugin loaded: 
>>> checkpoint/none
>>> [2013-06-03T09:46:42+03:00] [86743] debug level = 2
>>> [2013-06-03T09:46:42+03:00] [86743] task 0 (18462) started 
>>> 2013-06-03T09:46:42+03:00
>>>
>>>
>>> [2013-06-03T09:46:42+03:00] [86743] AcctGatherEnergy NONE plugin loaded
>>> [2013-06-03T10:02:17+03:00] [86743] auth plugin for Munge 
>>> (http://code.google.com/p/munge/) loaded
>>>
>>>
>>> [2013-06-03T10:02:17+03:00] [86742] auth plugin for Munge 
>>> (http://code.google.com/p/munge/) loaded
>>>
>>>
>>> 18041 ?        Sl     0:00 slurmstepd: [86742]
>>>
>>>
>>> 18044 ?        S      0:00
>>> /bin/bash /slurmdb/tmp/slurmd/job86742/slurm_script
>>> 18450 ?        S      0:00 srun blastn -num_alignments 5 -num_threads 6 
>>> -query /wrk/user/pb_20506_tmpdir/pb_chunk_00005.fasta -db nt -out 
>>> /wrk/user/pb_20506_tmpdir/pb_chunk_00005.fasta.result
>>>
>>>
>>>
>>>
>>> #SBATCH -t 24:00:00
>>> #SBATCH -n 6
>>> #SBATCH -p parallel
>>> #SBATCH --nodes 1
>>> #SBATCH --ntasks 1
>>> #SBATCH --cpus-per-task=6
>>> #SBATCH --mem 16000
>>>
>>> Best Regards,
>>> Tommi
>>>
>>>
>
>