[slurm-dev] RE: Distribute M jobs on N nodes without duplication

2015-10-02 Thread John Hearns


> So far I have tried srun, sbatch and salloc, and thought sbatch would do
> what I am looking for.  However, sbatch starts by assigning the requested
> resource configuration but then runs every srun command on every node.  For
> instance, if my script looks like:



sbatch is the command to submit a batch job, which usually consists of a shell 
script with some #SBATCH-style parameters at the start.

srun is the command to start an interactive job session, or to run a command 
from the command line.

I wouldn’t do an srun in the middle of a batch job... Why not just submit 
separate batch jobs?
Or you could use a small job array: http://slurm.schedmd.com/job_array.html

With the single line in the script:

./mycode.x  < input.in > output${SLURM_ARRAY_TASK_ID}.out


Also, as it is running as a batch job, you would not need the &.
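
A minimal sketch of what such an array job script might look like. The array range, the one-GPU request and the dry-run fallback are assumptions for illustration, not taken from the original post; only the ./mycode.x line comes from above.

```shell
#!/bin/bash
# Assumed: four independent runs, one task and one GPU each.
#SBATCH --array=1-4
#SBATCH --ntasks=1
#SBATCH --gres=gpu:1

# Slurm sets SLURM_ARRAY_TASK_ID for each array element; the default of 1
# only exists so the script can be tried outside Slurm.
task_id="${SLURM_ARRAY_TASK_ID:-1}"
outfile="output${task_id}.out"

if [ -x ./mycode.x ] && [ -r input.in ]; then
    ./mycode.x < input.in > "${outfile}"
else
    # Hypothetical dry-run branch: just show what would have been run.
    echo "dry run: ./mycode.x < input.in > ${outfile}"
fi
```

Submitted once with sbatch, each array element runs as its own job with its own task ID, so each run writes to a separate output file.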






Do you have 2 GPUs in each node? Do you want to run two jobs on each node, or 
just one?



Scanned by MailMarshal - M86 Security's comprehensive email content security 
solution.


Any views or opinions presented in this email are solely those of the author 
and do not necessarily represent those of the company. Employees of XMA Ltd are 
expressly required not to make defamatory statements and not to infringe or 
authorise any infringement of copyright or any other legal right by email 
communications. Any such communication is contrary to company policy and 
outside the scope of the employment of the individual concerned. The company 
will not accept any liability in respect of such communication, and the 
employee responsible will be personally liable for any damages or other 
liability arising. XMA Limited is registered in England and Wales (registered 
no. 2051703). Registered Office: Wilford Industrial Estate, Ruddington Lane, 
Wilford, Nottingham, NG11 7EP


[slurm-dev] RE: Distribute M jobs on N nodes without duplication

2015-10-02 Thread DIAM code distribution DIAM/CDRH/FDA
> I wouldn’t do an srun in the middle of a batch job... Why not just submit
> separate batch jobs?
>
> Or you could use a small job array: http://slurm.schedmd.com/job_array.html
>
All the sbatch examples in the Slurm manual use srun to execute the runs.
Many other websites give sbatch examples too, and all of them use srun,
unless they use some version of MPI.


> Do you have 2 GPUs in each node? Do you want to run two jobs on each node,
> or just one?

I wish to run at least 2 GPU jobs on each node.


[slurm-dev] RE: Distribute M jobs on N nodes without duplication

2015-10-02 Thread John Hearns
I stand corrected.

I find myself in a maze of twisty little passages, all alike

> All the sbatch examples in the Slurm manual use srun to execute the runs.
> Many other websites give sbatch examples too, and all of them use srun,
> unless they use some version of MPI.






[slurm-dev] RE: Distribute M jobs on N nodes without duplication

2015-10-02 Thread Douglas Jacobsen
Hi, I'm not sure I understand the problem, but you can specify -N (--nodes),
the number of tasks, and so on for each srun.  That way you can control how
many nodes and tasks each srun uses:

srun -N 1 --gres=gpu:1 ...
srun -N 1 --gres=gpu:1 ...

from your original example should work.
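
For what it's worth, here is a rough sketch of how those srun lines might sit inside a batch script. The allocation size, the -n 1 and --output options, ./mycode.x as the command, and the run_step wrapper are all assumptions for illustration; the lines above elide the actual command.

```shell
#!/bin/bash
# Assumed allocation: 2 nodes with 2 GPUs each, room for 4 single-GPU steps.
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --gres=gpu:2

# Hypothetical wrapper so the sketch can be tried outside Slurm: it calls
# srun when available, otherwise it prints the command it would have run.
run_step() {
    if command -v srun >/dev/null 2>&1; then
        srun "$@"
    else
        echo "dry run: srun $*"
    fi
}

launched=0
for i in 1 2 3 4; do
    # One node, one task, one GPU per step; '&' lets the steps run side by side.
    run_step -N 1 -n 1 --gres=gpu:1 --output="output${i}.out" ./mycode.x &
    launched=$((launched + 1))
done

# Keep the batch script alive until every backgrounded step has finished.
wait
echo "launched ${launched} steps"
```

Without the trailing & the steps would run one after another, and without the final wait the batch script would exit while steps are still running. Giving each srun an explicit -n 1 keeps it from inheriting the task count of the whole allocation.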

-Doug



On Fri, Oct 2, 2015 at 2:11 PM, DIAM code distribution DIAM/CDRH/FDA <
diamc...@gmail.com> wrote:

> Can anyone please help me achieve this very basic kind of job
> distribution?  This problem has not been solved yet.
>
> On Fri, Oct 2, 2015 at 12:49 PM, John Hearns 
> wrote:
>
>> I stand corrected.
>>
>>
>>
>> I find myself in a maze of twisty little passages, all alike
>>
>>
>>
>> All the sbatch examples in the Slurm manual use srun to execute the runs.
>> Many other websites give sbatch examples too, and all of them use srun,
>> unless they use some version of MPI.