On 12 February 2017 at 16:06, Travis DePrato <trav...@umich.edu> wrote:
> I've tried multiple variations of the SelectTypeParameters option (before
> sending this mail) without success.
>
> Currently it's http://pastebin.com/ATcsvvtQ with
> SelectTypeParameters=CR_CPU_Memory
>
> I'm running 10 jobs, each single-threaded and single-process, just sitting
> on "sleep 1000", but I can never get more than 8 to run at a time, and I
> still can't request memory other than 1.

I always ask the stupid questions: you are changing the conf, distributing
that change to all nodes, restarting slurmctld, and then running scontrol
reconfigure?

cheers
L.

------
The most dangerous phrase in the language is, "We've always done it this
way."

- Grace Hopper

> On Sat, Feb 11, 2017 at 3:26 AM Lachlan Musicman <data...@gmail.com>
> wrote:
>
>> 1. As EV noted, to get memory as a consumable resource, you will need to
>> add it to the line that says CR_CPU - change it to CR_CPU_Memory
>> https://slurm.schedmd.com/slurm.conf.html
>>
>> 2. That's because of CR_CPU combined with cons_res. Change to CR_Core
>> for per-core allocation or CR_Socket for per-socket. For definitions of
>> each, there's a hardware page:
>>
>> https://slurm.schedmd.com/cons_res.html
>>
>> but for the cpu/core/socket definitions, I found the image at the top of
>> this page very helpful:
>>
>> https://slurm.schedmd.com/mc_support.html
>>
>> L.
>>
>> On 11 February 2017 at 07:31, E V <eliven...@gmail.com> wrote:
>>
>> man slurm.conf and search for cons_res; you need to make a change from
>> the defaults. I don't remember the details at the moment, but that
>> should get you started.
>>
>> On Fri, Feb 10, 2017 at 2:42 PM, Travis DePrato <trav...@umich.edu>
>> wrote:
>> > For reference, slurm.conf: http://pastebin.com/XT6TvQhh
>> >
>> > I've been tasked with setting up a small cluster for a research group
>> > where I work, despite knowing relatively little about HPC or clusters
>> > in general.
>> > I've installed slurm on the eight compute nodes and the login node,
>> > but I'm having two issues currently:
>> >
>> > 1. I cannot specify a memory requirement other than --mem=1
>> > Sample submission output with --mem=2: http://pastebin.com/5PY9N6n4
>> >
>> > 2. I cannot get nodes to execute more than one job at a time. The 9th
>> > job is always queued with reason Resources. I think this is related
>> > to the lines
>> >
>> > scontrol: Consumable Resources (CR) Node Selection plugin loaded with
>> > argument 17
>> > scontrol: Serial Job Resource Selection plugin loaded with argument 17
>> > scontrol: Linear node selection plugin loaded with argument 17
>> >
>> > because it seems like slurm is only allocating whole nodes at a time.
>> >
>> > Sorry if this is basic setup, but I've tried googling to no avail.
>> > --
>> > Travis DePrato
>> > Computer Science & Engineering
>> > Math and Music Minors
>> > Student at University of Michigan
>> > Computer Consultant at EECS DCO
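Pulling the advice in this thread together, a minimal slurm.conf sketch for per-core scheduling with memory as a consumable resource might look like the following. The NodeName line is an illustrative placeholder, not taken from the poster's actual config; CPU counts and RealMemory must match the real hardware:

```
# slurm.conf -- schedule individual cores and track memory
# as a consumable resource, instead of allocating whole nodes
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory   # or CR_CPU_Memory, as discussed above

# Memory enforcement needs RealMemory (in MB) set on each node definition;
# without it, --mem requests have nothing to schedule against.
# Node name, CPU count, and memory below are placeholders.
NodeName=node[1-8] CPUs=8 RealMemory=32000 State=UNKNOWN
```

After editing, the changed slurm.conf has to be copied to every node; then restart slurmctld (and slurmd on the compute nodes) and run `scontrol reconfigure`, per Lachlan's checklist above. With memory as a consumable resource, a submission like `sbatch --mem=2G job.sh` should then be accepted rather than failing as in the --mem=2 paste.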