From the beginning:
I need to build a script that reads files from a certain location
(already done), gathers them, and runs another resource-devouring script
using those files as arguments.
We have 12 nodes with 4 CPUs each.
I need to assign 1 CPU per file, but I can't reserve all
available CPUs.
So I am trying to use Slurm to gather those script commands in a batch file
and run them 1 file per CPU, with a maximum of, let's say, 20 CPUs.
So far I couldn't get it to work properly. I got dozens of different
outcomes:
- Number of tasks running was OK, but all the tasks were the same
(same file, repeated 20 times).
- Number of tasks running was OK, but all the tasks were running on
one node (only 4 cpus assigned).
- Number of nodes reserved was OK, but the tasks were running on only
one of them.
- Submission OK, but no tasks running whatsoever.
And a few other undesired outcomes.
What I use is sbatch and srun.
The master script (the one that gathers files and runs sbatch) has its
Slurm command like this:
"sbatch -J Name -c <user_defined_value> batchscript.log"
And the batch script (generated automatically) looks like this:
"
#!/bin/bash
srun -c 1 -n 1 scipt.py file1 &
srun -c 1 -n 1 scipt.py file2 &
srun -c 1 -n 1 scipt.py file3 &
srun -c 1 -n 1 scipt.py file4 &
.
.
.
srun -c 1 -n 1 scipt.py filen
wait ${!}
"
where n is the same number as the value passed to -c,
and for now it tells me:
"sbatch: error: Batch job submission failed: Requested node
configuration is not available"
I don't know how to make it work the way I want.
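
For reference, here is a rough sketch of how the batch script could be generated instead. It is not a confirmed fix, only an illustration under some assumptions: make_batch is a hypothetical helper name, scipt.py is the worker named above, and file names are assumed to contain no single quotes. The idea is to request the CPUs as tasks (--ntasks) rather than as CPUs per task (-c), since a single task has to fit on one node and a node here has only 4 CPUs, and to end with a bare wait so that every background step is waited for (wait ${!} waits only for the most recently started one).

```shell
#!/bin/bash
# Sketch only: print a Slurm batch script that starts one 1-CPU job step per file.
# make_batch is a hypothetical helper name; scipt.py is the worker from the post.
# Usage: make_batch MAX_TASKS FILE...
make_batch() {
    max_tasks=$1
    shift
    ntasks=$#
    # Request at most MAX_TASKS CPUs, and never more than there are files.
    if [ "$ntasks" -gt "$max_tasks" ]; then
        ntasks=$max_tasks
    fi
    echo '#!/bin/bash'
    # --ntasks lets Slurm spread the tasks over several nodes;
    # --cpus-per-task has to fit on a single node (4 CPUs here).
    echo "#SBATCH --ntasks=$ntasks"
    echo '#SBATCH --cpus-per-task=1'
    for f in "$@"; do
        # Assumes file names contain no single quotes.
        printf "srun -N1 -n1 -c1 --exclusive scipt.py '%s' &\n" "$f"
    done
    # A bare wait blocks until every background step has finished.
    echo 'wait'
}
```

It could be used as "make_batch 20 file1 file2 file3 > batch.sh" followed by "sbatch -J Name batch.sh". With more files than tasks, the extra srun steps should be deferred until a CPU in the allocation frees up. On newer Slurm releases the step option --exact may be needed in place of --exclusive, so check the srun man page for the installed version.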
On Nov 8, 1:54 pm, Carles Fenoy <[email protected]> wrote:
> Hi Chris,
>
> Can you explain in more detail what exactly you are using to submit the jobs, and the
> script you submitted?
>
> On Tue, Nov 8, 2011 at 12:22 PM, Chris Rataj <[email protected]> wrote:
> > I got something:
> > I tried to use ampersands and the wait command, together with the -c option.
> > Each node has 4 cores, and when I tried to set the number of cores per task to
> > ten, it didn't switch to another node but spat out an error:
> > "sbatch: error: Batch job submission failed: Requested node
> > configuration is not available"
> > Is it possible to force sbatch to open another node?
>
> --
> Carles Fenoy