Thank you, Morris. LLN looks like it may be what I'm looking for.
If it allocates jobs based on CPU load, what happens when a node is running a high-memory, low-CPU job and somebody schedules a high-CPU, low-memory job that won't fit on that node because of memory requirements?

On Thu, May 8, 2014 at 1:34 PM, Sean Caron <[email protected]> wrote:

> So, say you've got some (large) number of jobs that are more or less
> "perfectly" parallel; that is, they don't really have anything to do with
> one another. They can be executed in any sequence and have no interlocking
> dependencies; the only communication is between the client node and the
> gateway node, to fetch more data or to output results; there is no
> communication between client nodes. In this case, would it be preferable to
> run LLN (or CR_LLN? which? why?) versus running CR_Core_Memory?
>
> Best,
>
> Sean
>
> On Thu, May 8, 2014 at 4:21 PM, Lloyd Brown <[email protected]> wrote:
>
>> Don't forget the communication implications of the task distribution
>> either. In general, if you start with fewer nodes and change to more
>> nodes (for the same total number of processes), your communication is
>> more likely to be going between nodes, which will be slower than working
>> within a node.
>>
>> Also, if the communication pattern is highly adjacent (lots of
>> communication with near neighbors, but not much with farther neighbors),
>> using a cyclic allocation may also hurt you, even with the same
>> node/processes-per-node allocation, since the neighbors of a process are
>> more likely to be on different nodes.
>>
>> How severe this is will depend on the specific algorithm, software,
>> communication pattern, etc. But as you work with your users, it's worth
>> considering.
>>
>> Lloyd Brown
>> Systems Administrator
>> Fulton Supercomputing Lab
>> Brigham Young University
>> http://marylou.byu.edu
>>
>> On 05/08/2014 12:01 PM, Ryan Cox wrote:
>> > Rather than maximize fragmentation, you probably want to do it on a
>> > per-job basis. If you want one core per node: sbatch -N $numnodes -n
>> > $numnodes. Anything else would require the -m flag. I haven't played
>> > with it recently, but I think you would want -m cyclic.
>> >
>> > Ryan
>> >
>> > On 05/08/2014 11:49 AM, Atom Powers wrote:
>> >> How to spread jobs among nodes?
>> >>
>> >> It appears that my Slurm cluster is scheduling jobs to load up nodes
>> >> as much as possible before putting jobs on other nodes. I understand
>> >> the reasons for doing this; however, I foresee my users wanting to
>> >> spread jobs out among as many nodes as possible for various reasons,
>> >> some of which are even valid.
>> >>
>> >> How would I configure the scheduler to distribute jobs in something
>> >> like a round-robin fashion to many nodes instead of loading jobs onto
>> >> just a few nodes?
>> >>
>> >> I currently have:
>> >> 'SchedulerType' => 'sched/builtin',
>> >> 'SelectTypeParameters' => 'CR_Core_Memory',
>> >> 'SelectType' => 'select/cons_res',
>> >>
>> >> --
>> >> Perfection is just a word I use occasionally with mustard.
>> >> --Atom Powers--
>> >
>> > --
>> > Ryan Cox
>> > Operations Director
>> > Fulton Supercomputing Lab
>> > Brigham Young University
>> >

--
Perfection is just a word I use occasionally with mustard.
--Atom Powers--
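
To make the per-job spreading Ryan describes concrete, here is a minimal sketch. The node count (4) and the job script name are placeholders, not values from this cluster:

    # One task per node: request equal node and task counts, and ask Slurm
    # to distribute tasks round-robin (cyclically) across the allocated nodes.
    sbatch -N 4 -n 4 -m cyclic my_job.sh

    # -m is the short form of --distribution; this is equivalent:
    sbatch -N 4 -n 4 --distribution=cyclic my_job.sh

With -N equal to -n, each node gets exactly one task either way; the -m cyclic choice mainly matters once a job requests more tasks than nodes.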

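For the cluster-wide route the LLN discussion points at, a sketch of the relevant slurm.conf lines follows. This is written in raw slurm.conf syntax rather than the configuration-management syntax quoted above, and the partition and node names are made up for illustration:

    # slurm.conf: keep per-core and per-memory accounting, but prefer the
    # least-loaded nodes (those with the most idle CPUs) when allocating.
    SelectType=select/cons_res
    SelectTypeParameters=CR_Core_Memory,CR_LLN

    # Alternatively (assuming a Slurm version with the partition-level flag),
    # enable least-loaded-node scheduling for a single partition only.
    PartitionName=spread Nodes=node[01-16] LLN=YES State=UP

With CR_Core_Memory still in place, memory remains a tracked resource, so a job's memory request continues to constrain which nodes are eligible for it.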