Please remember that this might lead to a huge waste of resources. Imagine you
have a cluster of 10 nodes with 10 cores each. Then somebody submits 10 jobs
requesting 1 core per job. If I understand you correctly, you would like to see
one job per node then? Now imagine someone else submits 9 jobs requesting the
nodes exclusively. None of those 9 jobs can start, because there is a one-core
job sitting on each node. If the first 10 jobs had been packed onto a single
node, all 9 of the later jobs could have started immediately.
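For concreteness, the two submission patterns in this scenario might look
roughly like the following (a sketch only; work.sh and big_job.sh are
hypothetical job scripts, the flags are standard sbatch options):

    # ten single-core jobs -- under the policy described above
    # these would land one per node
    for i in $(seq 1 10); do
      sbatch --ntasks=1 --wrap="./work.sh $i"
    done

    # nine jobs that each want a whole node to themselves;
    # none can start while every node carries one small job
    for i in $(seq 1 9); do
      sbatch --exclusive --nodes=1 --wrap="./big_job.sh $i"
    done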
2017-03-18 19:06 GMT+01:00 kesim <ketiw...@gmail.com>:
> Dear John,
>
> Thank you for your answer. Obviously you are right that I could run
> everything through Slurm and thus avoid the issue, and your points are
> taken. However, I still insist that it is a serious bug not to take the
> actual CPU load into account when the scheduler places a job, regardless
> of whose fault it is that a non-Slurm job is running. I would not expect
> that from even the simplest scheduler, and if I had had such prior
> knowledge I would not have invested so much time and effort in setting
> up Slurm.
>
> Best regards,
> Ketiw
>
> On Sat, Mar 18, 2017 at 5:42 PM, John Hearns <john.hea...@xma.co.uk> wrote:
>>
>> Kesim,
>>
>> What you are saying is that Slurm schedules tasks based on the number
>> of allocated CPUs, rather than the actual load factor on the server.
>> As I recall, Gridengine actually used the load factor.
>>
>> However, you comment that "users run programs on the nodes" and "the
>> slurm is aware about the load of non-slurm jobs". IMHO, in any
>> well-run HPC setup, any user running jobs without using the scheduler
>> would have their fingers broken, or at least bruised with the clue
>> stick.
>>
>> Seriously, three points:
>>
>> a) Tell users to use 'salloc' and 'srun' to run interactive jobs. They
>> can easily open a Bash session on a compute node and do what they
>> like, under the Slurm scheduler.
>>
>> b) Implement the pam_slurm PAM module. It is a few minutes' work. This
>> means your users cannot go behind the Slurm scheduler and log into the
>> nodes.
>>
>> c) On Bright clusters, which I configure, there is a healthcheck
>> running which warns you when a user is detected logging in without
>> using Slurm.
>>
>> Seriously again: you have implemented an HPC infrastructure and have
>> gone to the time and effort of implementing a batch scheduling system.
>> A batch scheduler can be adapted to let your users do their jobs,
>> including interactive shell sessions and remote visualization
>> sessions. Do not let the users ride roughshod over you.
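To illustrate point (a), an interactive session under the scheduler can be as
simple as the following (a minimal sketch; the resource requests are
placeholders and my_program is a hypothetical example):

    # one-step interactive shell on a compute node
    srun --ntasks=1 --cpus-per-task=4 --pty bash

    # or: grab an allocation first, then launch steps inside it
    salloc --ntasks=1 --cpus-per-task=4
    srun ./my_program   # runs on the allocated node, tracked by Slurm
    exit                # release the allocation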
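Point (b) usually comes down to one line in the node's PAM stack. A sketch for
/etc/pam.d/sshd is below; the module name and install path can vary by
distribution, so check your Slurm packaging:

    # /etc/pam.d/sshd on each compute node:
    # reject SSH logins from users with no running Slurm job on this node
    account    required     pam_slurm.so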
>> ________________________________________
>> From: kesim [ketiw...@gmail.com]
>> Sent: 18 March 2017 16:16
>> To: slurm-dev
>> Subject: [slurm-dev] Re: Fwd: Scheduling jobs according to the CPU load
>>
>> Unbelievable, but it seems that nobody knows how to do that. It is
>> astonishing that such a sophisticated system fails at such a simple
>> problem. Slurm is aware of the CPU load of non-Slurm jobs, but it does
>> not use that information. My original understanding of LLN was
>> apparently correct. I can practically kill the CPUs on a particular
>> node with non-Slurm tasks, and Slurm will still diligently submit 7
>> jobs to that node while leaving others idling. I consider this a
>> serious bug in this program.
>>
>> On Fri, Mar 17, 2017 at 10:32 AM, kesim <ketiw...@gmail.com> wrote:
>> Dear All,
>>
>> Yesterday I did some tests and it seemed that the scheduling was
>> following the CPU load, but I was wrong. My configuration is at the
>> moment:
>>
>> SelectType=select/cons_res
>> SelectTypeParameters=CR_CPU,CR_LLN
>>
>> Today I submitted 70 threaded jobs to the queue, and here is the
>> CPU_LOAD info (CPUs as allocated/idle/other/total):
>>
>> node1   0.08  7/0/0/7
>> node2   0.01  7/0/0/7
>> node3   0.00  7/0/0/7
>> node4   2.97  7/0/0/7
>> node5   0.00  7/0/0/7
>> node6   0.01  7/0/0/7
>> node7   0.00  7/0/0/7
>> node8   0.05  7/0/0/7
>> node9   0.07  7/0/0/7
>> node10  0.38  7/0/0/7
>> node11  0.01  0/7/0/7
>>
>> As you can see, it allocated 7 CPUs on node4, which has a CPU_LOAD of
>> 2.97, and 0 CPUs on the idling node11. Why is such a simple thing not
>> the default? What am I missing?
>>
>> On Thu, Mar 16, 2017 at 7:53 PM, kesim <ketiw...@gmail.com> wrote:
>> Thank you for the great suggestion. It is working! However, the
>> description of CR_LLN is misleading: "Schedule resources to jobs on
>> the least loaded nodes (based upon the number of idle CPUs)". I
>> understood it to mean that if two nodes have not fully allocated their
>> CPUs, the node with the smaller number of allocated CPUs takes
>> precedence. The bracketed comment should therefore be removed from the
>> description.
>>
>> On Thu, Mar 16, 2017 at 6:24 PM, Paul Edmon <ped...@cfa.harvard.edu> wrote:
>>
>> You should look at LLN (least loaded nodes):
>>
>> https://slurm.schedmd.com/slurm.conf.html
>>
>> That should do what you want.
>>
>> -Paul Edmon-
>>
>> On 03/16/2017 12:54 PM, kesim wrote:
>>
>> ---------- Forwarded message ----------
>> From: kesim <ketiw...@gmail.com>
>> Date: Thu, Mar 16, 2017 at 5:50 PM
>> Subject: Scheduling jobs according to the CPU load
>> To: slurm-dev@schedmd.com
>>
>> Hi all,
>>
>> I am a new user, and I created a small network of 11 nodes with 7 CPUs
>> per node out of users' desktops. I configured Slurm with:
>>
>> SelectType=select/cons_res
>> SelectTypeParameters=CR_CPU
>>
>> When I submit a task with 'srun -n70 task', it fills 10 nodes with 7
>> tasks per node. However, I have no clue what the algorithm for
>> choosing the nodes is. Users run programs on the nodes, and some nodes
>> are busier than others. It seems logical that the scheduler should
>> submit the tasks to the less busy nodes, but that is not the case. In
>> the output of 'sinfo -N -o "%N %O %C"' I can see that the jobs are
>> allocated to node11, which has a load of 2.06, while node4 is left
>> totally idle. That somehow makes no sense to me.
>>
>> node1   0.00  7/0/0/7
>> node2   0.26  7/0/0/7
>> node3   0.54  7/0/0/7
>> node4   0.07  0/7/0/7
>> node5   0.00  7/0/0/7
>> node6   0.01  7/0/0/7
>> node7   0.00  7/0/0/7
>> node8   0.01  7/0/0/7
>> node9   0.06  7/0/0/7
>> node10  0.11  7/0/0/7
>> node11  2.06  7/0/0/7
>>
>> How can I configure Slurm to fill the nodes with minimum load first?
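For reference, the configuration the thread converges on is the slurm.conf
fragment below. Note that, as the documentation quoted above says, CR_LLN
picks nodes by the number of idle CPUs, not by the OS load average, and
changing SelectType requires restarting the Slurm daemons:

    # slurm.conf (relevant fragment)
    SelectType=select/cons_res          # allocate individual CPUs, not whole nodes
    SelectTypeParameters=CR_CPU,CR_LLN  # CR_LLN: fill least-loaded nodes
                                        # (most idle CPUs) first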