[slurm-dev] RE: Distribute M jobs on N nodes without duplication
> So far I tried my hands with SRUN, SBATCH and SALLOC, and thought SBATCH will do what I am looking for. However, SBATCH starts with assigning the requested resource configuration but then runs every srun command on every node. For instance, if my script looks like:

sbatch is the command to submit a batch job, which usually consists of a shell script with some #SBATCH-style parameters at the start.

srun is the command to start an interactive job session, or to run a command from the command line.

I wouldn’t do an srun in the middle of a batch job…. Why not just submit separate batch jobs?

Or you could use a small job array: http://slurm.schedmd.com/job_array.html

With the single line in the script:

    ./mycode.x < input.in > output${SLURM_ARRAY_TASK_ID}.out

Also, as it is running as a batch job, you would not need the &.

Do you have 2 GPUs in each node? Do you want to run two jobs on each node, or just one?

--
Scanned by MailMarshal - M86 Security's comprehensive email content security solution.

--
Any views or opinions presented in this email are solely those of the author and do not necessarily represent those of the company. Employees of XMA Ltd are expressly required not to make defamatory statements and not to infringe or authorise any infringement of copyright or any other legal right by email communications. Any such communication is contrary to company policy and outside the scope of the employment of the individual concerned. The company will not accept any liability in respect of such communication, and the employee responsible will be personally liable for any damages or other liability arising. XMA Limited is registered in England and Wales (registered no. 2051703). Registered Office: Wilford Industrial Estate, Ruddington Lane, Wilford, Nottingham, NG11 7EP
[slurm-dev] RE: Distribute M jobs on N nodes without duplication
> I wouldn’t do an srun in the middle of a batch job…. Why not just submit separate batch jobs?
>
> Or you could use a small job array http://slurm.schedmd.com/job_array.html

All the examples for SBATCH (in the SLURM manual) use 'SRUN' for execution of runs. There are a lot of other websites which give SBATCH examples, and all of them use SRUN, unless using some version of MPI.

> Do you have 2 GPUs in each node? Do you want to run two jobs on each node, or just one?

I wish to run a minimum of 2 GPU jobs on each node.
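A minimal sketch of the job-array suggestion quoted above, assuming four independent runs with one GPU each, and that ./mycode.x and input.in (both named in the thread) sit in the submission directory. The array range 0-3, the job name, and the file name array_job.sh are illustrative assumptions, not taken from the original posts. The snippet only writes the job script to disk so it can be inspected; submitting it is left as a comment.

```shell
# Write a hypothetical job-array script to array_job.sh (file name is an assumption).
cat > array_job.sh <<'EOF'
#!/bin/bash
# Assumed job name.
#SBATCH --job-name=gpu-array
# Assumed: 4 independent runs, indexed 0..3.
#SBATCH --array=0-3
# One task and one GPU per array element.
#SBATCH --ntasks=1
#SBATCH --gres=gpu:1

# Each array element is scheduled as its own job and sees its own
# SLURM_ARRAY_TASK_ID, so every run writes to a distinct output file.
./mycode.x < input.in > output${SLURM_ARRAY_TASK_ID}.out
EOF

# Submit from a Slurm login node with:
#   sbatch array_job.sh
```

Because every array element is a separate one-GPU job, the scheduler may place two of them on a node that exposes two GPUs, but whether it does so depends on the site configuration; the array by itself does not guarantee two jobs per node.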
[slurm-dev] RE: Distribute M jobs on N nodes without duplication
I stand corrected.

I find myself in a maze of twisty little passages, all alike.

> All the examples for SBATCH (in the SLURM manual) use 'SRUN' for execution of runs. There are a lot of other websites which give SBATCH examples, and all of them use SRUN, unless using some version of MPI.
[slurm-dev] RE: Distribute M jobs on N nodes without duplication
Hi,

I'm not sure I understand the problem, but you can specify -N (--nodes), tasks, and so on for each srun. That way you can control how many nodes and tasks are distributed per srun:

    srun -N 1 --gres=gpu:1 ...
    srun -N 1 --gres=gpu:1 ...

from your original example should work.

-Doug

On Fri, Oct 2, 2015 at 2:11 PM, DIAM code distribution DIAM/CDRH/FDA <diamc...@gmail.com> wrote:
> Anyone please help how to achieve this very basic kind of job distribution? This problem has not been solved yet.
>
> On Fri, Oct 2, 2015 at 12:49 PM, John Hearns wrote:
>> I stand corrected.
>>
>> I find myself in a maze of twisty little passages, all alike
>>
>> All the examples for SBATCH (in the SLURM manual) use 'SRUN' for execution of runs. There are a lot of other websites which give SBATCH examples, and all of them use SRUN, unless using some version of MPI.
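Doug's suggestion can be sketched as a single batch script that launches each run as its own one-node, one-GPU job step. This is a sketch under assumptions rather than a tested recipe: it assumes an allocation of 2 nodes with 2 GPUs each, four inputs named input1.in to input4.in, and the file name steps_job.sh, none of which come from the thread. Depending on the Slurm version, an extra srun option such as --exclusive (older releases) or --exact (newer releases) may be needed to keep the steps from sharing resources.

```shell
# Write a hypothetical multi-step batch script to steps_job.sh (file name is an assumption).
cat > steps_job.sh <<'EOF'
#!/bin/bash
# Assumed allocation: 2 nodes, with 2 tasks and 2 GPUs per node.
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --gres=gpu:2

# Launch one job step per input; each step asks for 1 node, 1 task, 1 GPU.
# The trailing & puts the step in the background so the next one can start.
for i in 1 2 3 4; do
    srun -N 1 -n 1 --gres=gpu:1 ./mycode.x < "input${i}.in" > "output${i}.out" &
done

# Keep the batch script alive until every background step has finished.
wait
EOF

# Submit from a Slurm login node with:
#   sbatch steps_job.sh
```

Unlike the job-array form, the & is needed here, because the steps run inside one batch script and would otherwise execute one after another.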