Dear All,

Yesterday I did some tests and it seemed that the scheduling was following the CPU load, but I was wrong. My configuration at the moment is:

SelectType=select/cons_res
SelectTypeParameters=CR_CPU,CR_LLN

Today I submitted 70 threaded jobs to the queue, and here is the CPU_LOAD info:

node1 0.08 7/0/0/7
node2 0.01 7/0/0/7
node3 0.00 7/0/0/7
node4 2.97 7/0/0/7
node5 0.00 7/0/0/7
node6 0.01 7/0/0/7
node7 0.00 7/0/0/7
node8 0.05 7/0/0/7
node9 0.07 7/0/0/7
node10 0.38 7/0/0/7
node11 0.01 0/7/0/7

As you can see, it allocated 7 CPUs on node4 with CPU_LOAD 2.97 and 0 CPUs on the idling node11. Why is such a simple thing not the default? What am I missing?

On Thu, Mar 16, 2017 at 7:53 PM, kesim <ketiw...@gmail.com> wrote:

> Thank you for the great suggestion. It is working! However, the description of
> CR_LLN is misleading: "Schedule resources to jobs on the least loaded nodes
> (based upon the number of idle CPUs)". I understood this to mean that if two
> nodes do not have all of their CPUs allocated, the node with the smaller
> number of allocated CPUs takes precedence. Therefore the bracketed comment
> should be removed from the description.
>
> On Thu, Mar 16, 2017 at 6:24 PM, Paul Edmon <ped...@cfa.harvard.edu> wrote:
>
>> You should look at LLN (least loaded nodes):
>>
>> https://slurm.schedmd.com/slurm.conf.html
>>
>> That should do what you want.
>>
>> -Paul Edmon-
>>
>> On 03/16/2017 12:54 PM, kesim wrote:
>>
>> ---------- Forwarded message ----------
>> From: kesim <ketiw...@gmail.com>
>> Date: Thu, Mar 16, 2017 at 5:50 PM
>> Subject: Scheduling jobs according to the CPU load
>> To: slurm-dev@schedmd.com
>>
>> Hi all,
>>
>> I am a new user, and I created a small network of 11 nodes, 7 CPUs per
>> node, out of users' desktops.
>> I configured slurm as:
>> SelectType=select/cons_res
>> SelectTypeParameters=CR_CPU
>> When I submit a task with srun -n70 task
>> it fills 10 nodes with 7 tasks/node. However, I have no clue what
>> algorithm is used to choose the nodes. Users run programs on the nodes, and
>> some nodes are busier than others. It seems logical that the scheduler
>> should submit the tasks to the less busy nodes, but that is not the case.
>> In the output of sinfo -N -o '%N %O %C' I can see that the jobs are
>> allocated to node11 with load 2.06, leaving node4, which is totally idle.
>> That somehow makes no sense to me.
>> node1 0.00 7/0/0/7
>> node2 0.26 7/0/0/7
>> node3 0.54 7/0/0/7
>> node4 0.07 0/7/0/7
>> node5 0.00 7/0/0/7
>> node6 0.01 7/0/0/7
>> node7 0.00 7/0/0/7
>> node8 0.01 7/0/0/7
>> node9 0.06 7/0/0/7
>> node10 0.11 7/0/0/7
>> node11 2.06 7/0/0/7
>> How can I configure slurm to fill the nodes with the minimum load
>> first?
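
For anyone following the thread, here is a rough sketch of ranking the nodes in the sinfo listing above by CPU load outside the scheduler. It is only an illustration, not a Slurm feature. It assumes the three-column '%N %O %C' format (node name, CPU load, allocated/idle/other/total CPUs). The names Node, parse_sinfo and busiest are made up for this example, and SAMPLE is copied from the listing in the original question. It prints an srun command that uses the existing --exclude option to skip the busiest node.

```python
# Sketch only: rank nodes by the CPU load reported by sinfo.
# Assumes "%N %O %C" lines: name, load, allocated/idle/other/total CPUs.
from typing import List, NamedTuple


class Node(NamedTuple):  # hypothetical helper type for this example
    name: str
    load: float
    alloc: int
    idle: int
    other: int
    total: int


def parse_sinfo(text: str) -> List[Node]:
    """Parse sinfo output, skipping headers and lines that do not fit."""
    nodes = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        name, load, cpus = fields
        try:
            alloc, idle, other, total = (int(x) for x in cpus.split("/"))
            nodes.append(Node(name, float(load), alloc, idle, other, total))
        except ValueError:
            # Header row, "N/A" load, or an unexpected CPU field.
            continue
    return nodes


def busiest(nodes: List[Node], count: int) -> List[Node]:
    """Return the `count` nodes with the highest CPU load."""
    return sorted(nodes, key=lambda n: n.load, reverse=True)[:count]


# Listing copied from the original question in this thread.
SAMPLE = """\
node1 0.00 7/0/0/7
node2 0.26 7/0/0/7
node3 0.54 7/0/0/7
node4 0.07 0/7/0/7
node5 0.00 7/0/0/7
node6 0.01 7/0/0/7
node7 0.00 7/0/0/7
node8 0.01 7/0/0/7
node9 0.06 7/0/0/7
node10 0.11 7/0/0/7
node11 2.06 7/0/0/7
"""

nodes = parse_sinfo(SAMPLE)
# 70 tasks at 7 CPUs per node need 10 of the 11 nodes, so skip one.
exclude = ",".join(n.name for n in busiest(nodes, 1))
print(f"srun -n70 --exclude={exclude} task")
```

In practice the listing would come from running sinfo just before submitting, so the load values are only a snapshot and may have changed by the time the job starts.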