I should have been more careful/clear in my answer - we don't look at the SLURM_OVERCOMMIT variable.

The srun command-line options are not used either. If setting SLURM_OVERCOMMIT worked in version 1.2.8, I can assure you it was completely fortuitous; I wrote that code, and we never looked at that variable. The difference in behavior in 1.3 is caused by a change requested by the SLURM developers: we now use SLURM_JOB_CPUS_PER_NODE to determine the number of processors assigned to us, instead of SLURM_CPUS_PER_TASK (as was done in 1.2.x). This was required by a change in SLURM 1.3 that modified the definitions of these values.
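
(Illustrative sketch only, not Open MPI's actual code: one way a launcher could derive the CPU count of an allocation from SLURM_JOB_CPUS_PER_NODE, whose value typically uses SLURM's compressed "count(xrepeat)" notation, e.g. "4(x2),2", in contrast to the single integer in SLURM_CPUS_PER_TASK.)

/* Illustrative sketch -- not Open MPI's actual code.
 * Totals the CPUs in an allocation from SLURM_JOB_CPUS_PER_NODE, whose
 * value uses SLURM's compressed "count(xrepeat)" notation, e.g.
 * "4(x2),2" = 4 CPUs on each of two nodes plus 2 CPUs on a third. */
#include <stdio.h>
#include <stdlib.h>

static int total_cpus_from_job_cpus_per_node(const char *val)
{
    int total = 0;
    const char *p = val;

    while (p && *p) {
        char *end;
        long count  = strtol(p, &end, 10);   /* CPUs per node in this group */
        long repeat = 1;

        if (*end == '(') {                   /* optional "(xN)" repeat factor */
            repeat = strtol(end + 2, &end, 10);
            if (*end == ')')
                ++end;
        }
        total += (int)(count * repeat);

        p = (*end == ',') ? end + 1 : NULL;  /* next comma-separated group */
    }
    return total;
}

int main(void)
{
    const char *job_cpus      = getenv("SLURM_JOB_CPUS_PER_NODE"); /* used by 1.3 */
    const char *cpus_per_task = getenv("SLURM_CPUS_PER_TASK");     /* used by 1.2.x */

    if (job_cpus)
        printf("SLURM_JOB_CPUS_PER_NODE=%s -> %d CPUs in allocation\n",
               job_cpus, total_cpus_from_job_cpus_per_node(job_cpus));
    if (cpus_per_task)
        printf("SLURM_CPUS_PER_TASK=%s\n", cpus_per_task);
    return 0;
}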

If you look at our documentation (e.g., "man mpirun") you will see the equivalent set of options to control our mappers. There are a variety of mapper options you can use, including the ability to oversubscribe processors. The definitions of these options have remained constant across versions.

Ralph


On Mar 30, 2009, at 2:51 AM, Hartmut Häfner wrote:

Dear Support,
the answer seems to be simple, but it also seems to be wrong!
Below you can see the description of how SLURM_OVERCOMMIT should operate.

SLURM_CPUS_PER_TASK
 (default is 1) allows you to assign multiple CPUs to each
 (multithreaded) process in your job to improve performance. SRUN's
 -c (lowercase) option sets this variable. See the SRUN sections of
 the SLURM Reference Manual <https://computing.llnl.gov/LCdocs/slurm>
 for usage details.
SLURM_OVERCOMMIT
 (default is NO) allows you to assign more than one process per CPU
 (the opposite of the previous variable). SRUN's -O (uppercase)
 option sets this variable, which is /not/ intended to facilitate
 pthreads applications. See the SRUN sections of the SLURM Reference
 Manual <https://computing.llnl.gov/LCdocs/slurm> for usage details.

I don't understand how you can derive what you have written below from the description above! It is not the case that only one slot per node is allowed; rather, more than one process per CPU (slot) is allowed!

Remark: In version 1.2.8, SLURM_OVERCOMMIT=1 did not behave incorrectly!

Sincerely yours

H. Häfner

>>>>>>>>>
The answer is simple: when you set SLURM_OVERCOMMIT=1, the SLURM environment variables are telling us that only one slot per node is available for your use. This is done by the SLURM_TASKS_PER_NODE envar.

So we can only launch 1 proc/node as this is all SLURM is allowing us to do.

Ralph
>>>>>>>>>


On Mar 25, 2009, at 11:00 AM, Hartmut Häfner wrote:

Dear Support,
there is a problem with Open MPI versions 1.3 and 1.3.1 when using our batch system SLURM. On our parallel computer there are 2 queues: one with exclusive usage of slots (cores) within nodes (SLURM_OVERCOMMIT=0) and one with shared usage of slots within nodes (SLURM_OVERCOMMIT=1). Running a simple MPI program (I'll send you this program, mpi_hello.c, as an attachment) with SLURM_OVERCOMMIT set to 0, the executable works fine; running it with SLURM_OVERCOMMIT set to 1, it does not work correctly. Please have a look at the 2 output files. "Does not work correctly" means that the MPI program runs on 1 processor although I have started it (for example) on 4 processors (it does not work correctly for any processor number other than 1).
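
(For reference, a minimal MPI hello-world of the kind described typically looks like the sketch below; the attached mpi_hello.c may of course differ in detail. Under the reported misbehaviour, every process would print a communicator size of 1 rather than 4.)

/* Minimal MPI "hello world" of the kind described above; the attached
 * mpi_hello.c may differ in detail.  Under the reported problem the
 * printed size is 1 even though 4 processes were requested. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(name, &len);

    printf("Hello from rank %d of %d on %s\n", rank, size, name);

    MPI_Finalize();
    return 0;
}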

This error only occurs with versions 1.3 and 1.3.1. If I use older versions of Open MPI, the program works fine.

In the file Job_101442.out, the host list (4x icn001) from SLURM is printed, then the content of the file /scratch/JMS_tmpdir/Job_101442/tmp.CCaCM22772, then the command line (mpirun ...), then the stdout of the program mpi_hello.c (it runs on only 1 processor!), and finally the environment.

In the file Job_101440.out the same program is run. The only difference is that SLURM_OVERCOMMIT isn't set!

Under the hood of job_submit ..., "salloc -n4 script" is started. In "script" you will find the command
"mpirun --hostfile ...", as you can see in both output files.

Sincerely yours

H. Häfner

--
Hartmut Häfner
Karlsruhe Institute of Technology (KIT)
University Karlsruhe (TH)
Steinbuch Centre for Computing (SCC)
Scientific Computing and Applications (SCA)
Zirkel 2 (Campus Süd, Geb. 20.21, Raum 204)
D-76128 Karlsruhe

Fon +49(0)721 608 4869
Fax +49(0)721 32550
hartmut.haef...@kit.edu

http://www.rz.uni-karlsruhe.de/personen/hartmut.haefner
