Hi all, I'd like to reserve whole nodes for an MPI job, so I thought of using --cpus-per-task. The following submit script should allow me to run 24 MPI processes scattered over 24 nodes (since our nodes have 16 "cpus"):
> #!/bin/bash
>
> #SBATCH --error=output/err_%j.log
> #SBATCH --output=output/out_%j.log
> #SBATCH --partition=normalnodes
> #SBATCH --ntasks=24
> #SBATCH --cpus-per-task=16
>
> mpirun ./helloworld

This should allocate all 16 cpus of each node to the job, even though there would be only one MPI process per node. Unfortunately, I'm getting this error in output/err_%j.log:

> --------------------------------------------------------------------------
> All nodes which are allocated for this job are already filled.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> A daemon (pid unknown) died unexpectedly on signal 1 while attempting to
> launch so we are aborting.
>
> There may be more information reported by the environment (see above).
>
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --------------------------------------------------------------------------

Googling around, I found this thread discussing the problem: http://www.open-mpi.org/community/lists/users/2010/07/13479.php. It seems SLURM and Open MPI have trouble communicating the allocation in this case. I'd like to know whether that is still true, or whether I'm just doing something wrong.

To bypass this, I think the combination of --ntasks=24, --nodes=8 and --exclusive could be used. In that particular example, 3 tasks would run on each of the 8 nodes, and the nodes would be reserved entirely by the job, meaning no other job could run on them.
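For concreteness, here is a sketch of what I mean by the workaround, reusing the same partition and log paths as my original script (the exact behaviour of --exclusive may depend on the cluster's SLURM configuration, so treat this as an assumption, not a tested recipe):

```shell
#!/bin/bash
#SBATCH --error=output/err_%j.log
#SBATCH --output=output/out_%j.log
#SBATCH --partition=normalnodes
#SBATCH --ntasks=24        # 24 MPI processes in total
#SBATCH --nodes=8          # spread over 8 nodes (3 tasks per node)
#SBATCH --exclusive        # reserve the whole nodes, so the remaining
                           # 13 cpus per node stay unused by other jobs

mpirun ./helloworld
```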
Am I right? Thanks for your input. Nicolas
