Michael,

Is that really true? My understanding, and my experience, has been that given an MPI program, srun should start n tasks/ranks, not n separate copies.

I agree that for a non-MPI program like hostname it will start n copies. I also agree that calling srun and mpirun on the same line will start n copies of mpirun, which is probably not what anybody wants. I think the problem is that srun ./mpitest (given that mpitest is an MPI executable) starts 2 copies of mpitest, each with its own MPI_COMM_WORLD, which shows up as multiple rank 0 lines.

I believe I have seen that behavior before, but I don't recall whether the problem was with Slurm or my MPI build, or possibly that the correct PMI wasn't detected for some reason. I think that reading this page https://slurm.schedmd.com/mpi_guide.html closely helped.

Replacing srun with mpirun may work and may be a satisfactory workaround, but (correct me if I'm wrong) srun should work if all the components are compiled and linked with the correct libraries. I am still a little unclear about all of the pros and cons of doing it with srun vs. mpirun.
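
For what it's worth, here is a rough sketch of how I would check and use that (the commands are standard Slurm/Open MPI tools, but pmi2 is just a guess; use whichever plugin shows up in the list on your system):

$ srun --mpi=list           # list the PMI plugins this Slurm was built with
$ ompi_info | grep -i pmi   # check whether Open MPI was built against PMI

#!/bin/bash
#SBATCH -n 2
# launch the MPI ranks directly with srun, naming the PMI plugin explicitly
srun --mpi=pmi2 ./mpitest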

Mike Robbert


On 1/23/17 6:15 PM, Michael Gutteridge wrote:

I think this is what Paddy was getting at above: the `-n` argument to `srun` will run `-n` copies of the program indicated on the command line. This is how `srun` was designed and isn't a bug or problem.

I guess the question is what you're trying to accomplish. From appearances, you want to run an MPI program on the command line interactively. For this I'd say `salloc` as you've done it is the way to go. As rhc indicates above, if you have MPI built with PMI you don't need to tell `mpirun` what to do, it will detect the number and location of assigned nodes:

$ salloc -n 4 -N 4
salloc: Pending job allocation 46557547
salloc: job 46557547 queued and waiting for resources
salloc: job 46557547 has been allocated resources
salloc: Granted job allocation 46557547
$ srun hostname
node114
node311
node310
node312
$ mpirun bin/helloMPI
Hello world from processor node114, rank 0 out of 4 processors
Hello world from processor node312, rank 3 out of 4 processors
Hello world from processor node310, rank 1 out of 4 processors
Hello world from processor node311, rank 2 out of 4 processors

HTH


On Mon, Jan 23, 2017 at 1:36 PM, r...@open-mpi.org <r...@open-mpi.org> wrote:


    Note that 16.05 contains support for PMIx, so if you are using
    OMPI 2.0 or above, you should ensure that the slurm PMIx support
    is configured “on” and use that for srun (I believe you have to
    tell srun the PMI version to use, so perhaps “srun --mpi=pmix”?)
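
    To illustrate, a rough sketch of what that might look like, assuming
    your Slurm 16.05 build actually includes the PMIx plugin (the
    MpiDefault line is optional and goes in slurm.conf):

    $ srun --mpi=list           # confirm that "pmix" appears in the list
    $ srun --mpi=pmix ./mpitest
    # or make it the site-wide default in slurm.conf:
    MpiDefault=pmix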


    > On Jan 23, 2017, at 7:10 AM, TO_Webmaster <luftha...@gmail.com> wrote:
    >
    >
    > Is this OpenMPI? We experienced similar behaviour with OpenMPI. This
    > was fixed after recompiling OpenMPI with PMI, i.e.
    >
    > ./configure [...] --with-pmi=/path/to/slurm [...]
    >
    > 2017-01-23 14:22 GMT+01:00 liu junjun <ljjl...@gmail.com>:
    >> Hi Paddy,
    >>
    >> Thanks a lot for your kind help!
    >>
    >> Replacing mpirun with srun still doesn't seem to work. Here's what I did:
    >>> cat a.sh
    >> #!/bin/bash
    >> srun ./mpitest
    >>> sbatch -n 2 ./a.sh
    >> Submitted batch job 3611
    >>> cat slurm-3611.out
    >> Hello world: rank 0 of 1 running on cdd001
    >> Hello world: rank 0 of 1 running on cdd001
    >>
    >> So, srun just executed the program twice, instead of running it in parallel.
    >>
    >> If I replace the srun back to mpirun, the good output is produced:
    >>> cat a.sh
    >> #!/bin/bash
    >> mpirun ./mpitest
    >>> sbatch a.sh
    >> Submitted batch job 3612
    >>> cat slurm-3612.out
    >> Hello world: rank 0 of 2 running on cdd001
    >> Hello world: rank 1 of 2 running on cdd001
    >>
    >> I also tried using srun inside a bash script for a serial program:
    >>> cat a.sh
    >> #!/bin/bash
    >> srun echo Hello
    >>> sbatch -n 2 ./a.sh
    >> Submitted batch job 3614
    >>> cat slurm-3614.out
    >> Hello
    >> Hello
    >>
    >> Any idea?
    >>
    >> Thanks in advance!
    >>
    >> Junjun
    >>
    >>
    >>
    >> On Mon, Jan 23, 2017 at 6:16 PM, Paddy Doyle <pa...@tchpc.tcd.ie> wrote:
    >>>
    >>>
    >>> Hi Junjun,
    >>>
    >>> On Mon, Jan 23, 2017 at 12:04:17AM -0800, liu junjun wrote:
    >>>
    >>>> Hi all,
    >>>>
    >>>> I have a small MPI test program that just prints the rank id of a parallel job.
    >>>> The output is like this:
    >>>>> mpirun -n 2 ./mpitest
    >>>> Hello world: rank 0 of 2 running on cddlogin
    >>>> Hello world: rank 1 of 2 running on cddlogin
    >>>>
    >>>> I ran this test program with salloc. It produces similar output:
    >>>>> salloc -n 2
    >>>> salloc: Granted job allocation 3605
    >>>>> mpirun -n 2 ./mpitest
    >>>> Hello world: rank 0 of 2 running on cdd001
    >>>> Hello world: rank 1 of 2 running on cdd001
    >>>>
    >>>> I put this one-line command into a bash script to run with sbatch. It also gets the same result, as expected. However, it is totally different if it is run with srun:
    >>>>> srun -n 2 mpirun -n 2 ./mpitest
    >>>> Hello world: rank 0 of 2 running on cdd001
    >>>> Hello world: rank 1 of 2 running on cdd001
    >>>> Hello world: rank 0 of 2 running on cdd001
    >>>> Hello world: rank 1 of 2 running on cdd001
    >>>
    >>> That looks like the expected behaviour from calling both srun and mpirun; I have never tried it, but it looks like what might happen if you call them both.
    >>>
    >>> But it's not recommended to run your code like that.
    >>>
    >>> Basically, don't call both srun and mpirun! In your sbatch script, either put:
    >>>
    >>>  #SBATCH -n 2
    >>>  ....
    >>>  mpirun ./mpitest
    >>>
    >>>
    >>> ..or:
    >>>
    >>>
    >>>  #SBATCH -n 2
    >>>  ....
    >>>  srun ./mpitest
    >>>
    >>>
    >>> You don't need both. And it's simpler not to repeat the '-n 2' again in the mpirun/srun line, as it will lead to copy/paste errors when you change it in the '#SBATCH' line but not below.
    >>>
    >>>> The test program was invoked twice ($SLURM_NTASKS), and each invocation asked for 2 ($SLURM_NTASKS) CPUs for the MPI program!!
    >>>
    >>> Yes.
    >>>
    >>>> The problem of srun is actually not about mpi:
    >>>>> srun -n 2 echo "Hello"
    >>>> Hello
    >>>> Hello
    >>>>
    >>>> How can I resolve the problem with srun, and make it behave like sbatch or salloc, where the program is executed only one time?
    >>>>
    >>>> The version of slurm is 16.05.3, and
    >>>
    >>> Thanks,
    >>> Paddy
    >>>
    >>> --
    >>> Paddy Doyle
    >>> Trinity Centre for High Performance Computing,
    >>> Lloyd Building, Trinity College Dublin, Dublin 2, Ireland.
    >>> Phone: +353-1-896-3725
    >>> http://www.tchpc.tcd.ie/
    >>
    >>


