Please remember that this might lead to a huge waste of resources. Imagine you
have a cluster of 10 nodes with 10 cores each. Then somebody submits 10 jobs
requesting 1 core per job. If I understand you correctly, you would like to see
one job per node then? Now imagine someone else submits 9 jobs requesting the
nodes exclusively. None of those 9 jobs can start, because there is a one-core
job sitting on each node. If the first 10 jobs had been packed onto a single
node, all 9 of the later jobs could have started immediately.
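For concreteness, the two submission patterns in this scenario might look
roughly like the following (a sketch only; work.sh and big_job.sh are
hypothetical job scripts, the flags are standard sbatch options):

    # ten single-core jobs -- under the policy described above
    # these would land one per node
    for i in $(seq 1 10); do
      sbatch --ntasks=1 --wrap="./work.sh $i"
    done

    # nine jobs that each want a whole node to themselves;
    # none can start while every node carries one small job
    for i in $(seq 1 9); do
      sbatch --exclusive --nodes=1 --wrap="./big_job.sh $i"
    done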
2017-03-18 19:06 GMT+01:00 kesim <ketiw...@gmail.com>:
> Dear John,
>
> Thank you for your answer. Obviously you are right that I could run
> everything through Slurm and thus avoid the issue, and your points are
> taken. However, I still insist that it is a serious bug not to take the
> actual CPU load into account when the scheduler places a job, regardless
> of whose fault it is that a non-Slurm job is running. I would not expect
> that from even the simplest scheduler, and if I had had such prior
> knowledge I would not have invested so much time and effort in setting
> up Slurm.
>
> Best regards,
> Ketiw
>
> On Sat, Mar 18, 2017 at 5:42 PM, John Hearns <john.hea...@xma.co.uk> wrote:
>>
>> Kesim,
>>
>> What you are saying is that Slurm schedules tasks based on the number
>> of allocated CPUs, rather than the actual load factor on the server.
>> As I recall, Gridengine actually used the load factor.
>>
>> However, you comment that "users run programs on the nodes" and "the
>> slurm is aware about the load of non-slurm jobs". IMHO, in any
>> well-run HPC setup, any user running jobs without using the scheduler
>> would have their fingers broken, or at least bruised with the clue
>> stick.
>>
>> Seriously, three points:
>>
>> a) Tell users to use 'salloc' and 'srun' to run interactive jobs. They
>> can easily open a Bash session on a compute node and do what they
>> like, under the Slurm scheduler.
>>
>> b) Implement the pam_slurm PAM module. It is a few minutes' work. This
>> means your users cannot go behind the Slurm scheduler and log into the
>> nodes.
>>
>> c) On Bright clusters, which I configure, there is a healthcheck
>> running which warns you when a user is detected logging in without
>> using Slurm.
>>
>> Seriously again: you have implemented an HPC infrastructure and have
>> gone to the time and effort of implementing a batch scheduling system.
>> A batch scheduler can be adapted to let your users do their jobs,
>> including interactive shell sessions and remote visualization
>> sessions. Do not let the users ride roughshod over you.
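To illustrate point (a), an interactive session under the scheduler can be as
simple as the following (a minimal sketch; the resource requests are
placeholders and my_program is a hypothetical example):

    # one-step interactive shell on a compute node
    srun --ntasks=1 --cpus-per-task=4 --pty bash

    # or: grab an allocation first, then launch steps inside it
    salloc --ntasks=1 --cpus-per-task=4
    srun ./my_program   # runs on the allocated node, tracked by Slurm
    exit                # release the allocation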
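Point (b) usually comes down to one line in the node's PAM stack. A sketch for
/etc/pam.d/sshd is below; the module name and install path can vary by
distribution, so check your Slurm packaging:

    # /etc/pam.d/sshd on each compute node:
    # reject SSH logins from users with no running Slurm job on this node
    account    required     pam_slurm.so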
>> ________________________________________
>> From: kesim [ketiw...@gmail.com]
>> Sent: 18 March 2017 16:16
>> To: slurm-dev
>> Subject: [slurm-dev] Re: Fwd: Scheduling jobs according to the CPU load
>>
>> Unbelievable, but it seems that nobody knows how to do that. It is
>> astonishing that such a sophisticated system fails at such a simple
>> problem. Slurm is aware of the CPU load of non-Slurm jobs, but it does
>> not use that information. My original understanding of LLN was
>> apparently correct. I can practically kill the CPUs on a particular
>> node with non-Slurm tasks, and Slurm will still diligently submit 7
>> jobs to that node while leaving others idling. I consider this a
>> serious bug in this program.
>>
>> On Fri, Mar 17, 2017 at 10:32 AM, kesim <ketiw...@gmail.com> wrote:
>> Dear All,
>>
>> Yesterday I did some tests and it seemed that the scheduling was
>> following the CPU load, but I was wrong. My configuration is at the
>> moment:
>>
>> SelectType=select/cons_res
>> SelectTypeParameters=CR_CPU,CR_LLN
>>
>> Today I submitted 70 threaded jobs to the queue, and here is the
>> CPU_LOAD info (CPUs as allocated/idle/other/total):
>>
>> node1   0.08  7/0/0/7
>> node2   0.01  7/0/0/7
>> node3   0.00  7/0/0/7
>> node4   2.97  7/0/0/7
>> node5   0.00  7/0/0/7
>> node6   0.01  7/0/0/7
>> node7   0.00  7/0/0/7
>> node8   0.05  7/0/0/7
>> node9   0.07  7/0/0/7
>> node10  0.38  7/0/0/7
>> node11  0.01  0/7/0/7
>>
>> As you can see, it allocated 7 CPUs on node4, which has a CPU_LOAD of
>> 2.97, and 0 CPUs on the idling node11. Why is such a simple thing not
>> the default? What am I missing?
>>
>> On Thu, Mar 16, 2017 at 7:53 PM, kesim <ketiw...@gmail.com> wrote:
>> Thank you for the great suggestion. It is working! However, the
>> description of CR_LLN is misleading: "Schedule resources to jobs on
>> the least loaded nodes (based upon the number of idle CPUs)". I
>> understood it to mean that if two nodes have not fully allocated their
>> CPUs, the node with the smaller number of allocated CPUs takes
>> precedence. The bracketed comment should therefore be removed from the
>> description.
>>
>> On Thu, Mar 16, 2017 at 6:24 PM, Paul Edmon <ped...@cfa.harvard.edu> wrote:
>>
>> You should look at LLN (least loaded nodes):
>>
>> https://slurm.schedmd.com/slurm.conf.html
>>
>> That should do what you want.
>>
>> -Paul Edmon-
>>
>> On 03/16/2017 12:54 PM, kesim wrote:
>>
>> ---------- Forwarded message ----------
>> From: kesim <ketiw...@gmail.com>
>> Date: Thu, Mar 16, 2017 at 5:50 PM
>> Subject: Scheduling jobs according to the CPU load
>> To: slurm-dev@schedmd.com
>>
>> Hi all,
>>
>> I am a new user, and I created a small network of 11 nodes with 7 CPUs
>> per node out of users' desktops. I configured Slurm with:
>>
>> SelectType=select/cons_res
>> SelectTypeParameters=CR_CPU
>>
>> When I submit a task with 'srun -n70 task', it fills 10 nodes with 7
>> tasks per node. However, I have no clue what the algorithm for
>> choosing the nodes is. Users run programs on the nodes, and some nodes
>> are busier than others. It seems logical that the scheduler should
>> submit the tasks to the less busy nodes, but that is not the case. In
>> the output of 'sinfo -N -o "%N %O %C"' I can see that the jobs are
>> allocated to node11, which has a load of 2.06, while node4 is left
>> totally idle. That somehow makes no sense to me.
>>
>> node1   0.00  7/0/0/7
>> node2   0.26  7/0/0/7
>> node3   0.54  7/0/0/7
>> node4   0.07  0/7/0/7
>> node5   0.00  7/0/0/7
>> node6   0.01  7/0/0/7
>> node7   0.00  7/0/0/7
>> node8   0.01  7/0/0/7
>> node9   0.06  7/0/0/7
>> node10  0.11  7/0/0/7
>> node11  2.06  7/0/0/7
>>
>> How can I configure Slurm to fill the nodes with minimum load first?
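For reference, the configuration the thread converges on is the slurm.conf
fragment below. Note that, as the documentation quoted above says, CR_LLN
picks nodes by the number of idle CPUs, not by the OS load average, and
changing SelectType requires restarting the Slurm daemons:

    # slurm.conf (relevant fragment)
    SelectType=select/cons_res          # allocate individual CPUs, not whole nodes
    SelectTypeParameters=CR_CPU,CR_LLN  # CR_LLN: fill least-loaded nodes
                                        # (most idle CPUs) first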