Unbelievable, but it seems that nobody knows how to do that. It is
astonishing that such a sophisticated system fails at such a simple problem.
Slurm is aware of the CPU load of non-Slurm jobs but does not use that
information. My original understanding of LLN was apparently correct. I can
practically kill the CPUs on a particular node with non-Slurm tasks, but Slurm
will still diligently submit 7 jobs to that node, leaving other nodes idle. I
consider this a serious bug in this program.
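
A minimal sketch of the mismatch described above, assuming the quoted
description of CR_LLN ("based upon the number of idle CPUs") is accurate. The
loads are the ones reported further down in this thread; the CPU column is an
assumed pre-submission state with all 7 CPUs idle on every node, and the
helper names and the 1.0 load threshold are hypothetical. It only post-processes
text in the `sinfo -N -o '%N %O %C'` format and is not part of Slurm.

```python
# Sketch only: rank nodes from `sinfo -N -o '%N %O %C'`-style text two ways.
# Loads come from the thread; the A/I/O/T column is an ASSUMED state before
# submission (every node has all 7 CPUs idle from Slurm's point of view).
SAMPLE = """\
node1 0.08 0/7/0/7
node2 0.01 0/7/0/7
node3 0.00 0/7/0/7
node4 2.97 0/7/0/7
node5 0.00 0/7/0/7
node6 0.01 0/7/0/7
node7 0.00 0/7/0/7
node8 0.05 0/7/0/7
node9 0.07 0/7/0/7
node10 0.38 0/7/0/7
node11 0.01 0/7/0/7
"""


def parse_sinfo(text):
    """Parse 'name load alloc/idle/other/total' lines into dicts."""
    nodes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # blank or unexpected line
        name, load, cpus = parts
        try:
            load = float(load)
            alloc, idle, other, total = (int(x) for x in cpus.split("/"))
        except ValueError:
            continue  # header row or a non-numeric load such as "N/A"
        nodes.append({"name": name, "load": load, "alloc": alloc,
                      "idle": idle, "total": total})
    return nodes


def by_idle_cpus(nodes):
    """Most idle CPUs first: the ordering the CR_LLN description suggests."""
    return sorted(nodes, key=lambda n: -n["idle"])


def by_cpu_load(nodes):
    """Lowest CPU load first: the ordering the poster expected."""
    return sorted(nodes, key=lambda n: n["load"])


def exclude_list(nodes, max_load):
    """Comma-separated names of nodes whose load exceeds max_load (hypothetical threshold)."""
    return ",".join(n["name"] for n in nodes if n["load"] > max_load)


if __name__ == "__main__":
    nodes = parse_sinfo(SAMPLE)
    # Every node ties on idle CPUs, so this ordering cannot see node4's load.
    print([n["name"] for n in by_idle_cpus(nodes)])
    # Ordering by load puts node4 last.
    print([n["name"] for n in by_cpu_load(nodes)])
    print(exclude_list(nodes, 1.0))
```

As a manual workaround, the resulting list could be passed to srun's
--exclude option (for example srun -n70 --exclude=node4 task). This only
reflects the load at the moment sinfo was run; it does not make the scheduler
itself load-aware.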


On Fri, Mar 17, 2017 at 10:32 AM, kesim <ketiw...@gmail.com> wrote:

> Dear All,
> Yesterday I did some tests and it seemed that the scheduling was following
> CPU load, but I was wrong.
> My configuration at the moment is:
> SelectType=select/cons_res
> SelectTypeParameters=CR_CPU,CR_LLN
>
> Today I submitted 70 threaded jobs to the queue, and here is the CPU_LOAD
> info:
> node1     0.08    7/0/0/7
> node2     0.01    7/0/0/7
> node3     0.00    7/0/0/7
> node4     2.97    7/0/0/7
> node5     0.00    7/0/0/7
> node6     0.01    7/0/0/7
> node7     0.00    7/0/0/7
> node8     0.05    7/0/0/7
> node9     0.07    7/0/0/7
> node10    0.38    7/0/0/7
> node11    0.01    0/7/0/7
> As you can see, it allocated 7 CPUs on node4 with CPU_LOAD 2.97 and 0 CPUs
> on the idle node11. Why is such a simple thing not the default? What am I
> missing?
>
> On Thu, Mar 16, 2017 at 7:53 PM, kesim <ketiw...@gmail.com> wrote:
>
>> Thank you for the great suggestion. It is working! However, the description
>> of CR_LLN is misleading: "Schedule resources to jobs on the least loaded
>> nodes (based upon the number of idle CPUs)". I understood this to mean that
>> if two nodes do not have all of their CPUs allocated, the node with the
>> smaller number of allocated CPUs takes precedence. Therefore the bracketed
>> comment should be removed from the description.
>>
>> On Thu, Mar 16, 2017 at 6:24 PM, Paul Edmon <ped...@cfa.harvard.edu>
>> wrote:
>>
>>> You should look at LLN (least loaded nodes):
>>>
>>> https://slurm.schedmd.com/slurm.conf.html
>>>
>>> That should do what you want.
>>> -Paul Edmon-
>>>
>>> On 03/16/2017 12:54 PM, kesim wrote:
>>>
>>>
>>> ---------- Forwarded message ----------
>>> From: kesim <ketiw...@gmail.com>
>>> Date: Thu, Mar 16, 2017 at 5:50 PM
>>> Subject: Scheduling jobs according to the CPU load
>>> To: slurm-dev@schedmd.com
>>>
>>>
>>> Hi all,
>>>
>>> I am a new user and I created a small network of 11 nodes, 7 CPUs per
>>> node, out of users' desktops.
>>> I configured Slurm as:
>>> SelectType=select/cons_res
>>> SelectTypeParameters=CR_CPU
>>> When I submit a task with srun -n70 task
>>> it fills 10 nodes with 7 tasks/node. However, I have no clue what
>>> algorithm is used to choose the nodes. Users run programs on the nodes and
>>> some nodes are busier than others. It seems logical that the scheduler
>>> should submit the tasks to the less busy nodes, but that is not the case.
>>> In the output of sinfo -N -o '%N %O %C' I can see that the jobs are
>>> allocated to node11 with load 2.06, leaving node4, which is totally idle,
>>> unused. That makes no sense to me.
>>> node1     0.00    7/0/0/7
>>> node2     0.26    7/0/0/7
>>> node3     0.54    7/0/0/7
>>> node4     0.07    0/7/0/7
>>> node5     0.00    7/0/0/7
>>> node6     0.01    7/0/0/7
>>> node7     0.00    7/0/0/7
>>> node8     0.01    7/0/0/7
>>> node9     0.06    7/0/0/7
>>> node10    0.11    7/0/0/7
>>> node11    2.06    7/0/0/7
>>> How can I configure Slurm to fill the nodes with the lowest load
>>> first?
>>>
>>>
>>>
>>>
>>
>
