Re: [slurm-users] srun and mpirun

2018-04-13 Thread Chris Samuel
On Saturday, 14 April 2018 1:33:13 AM AEST Mahmood Naderan wrote:

> I tried with one of the NAS benchmarks (BT) with 121 processes, since the
> number of processes must be a square.

That's an IO benchmark, not going to help you for this.  You need something 
that is compute bound & comms intensive to see differences.

It also sounds like you've got a problem with either your Slurm job or Slurm 
itself from the error you posted.

-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC




Re: [slurm-users] srun and mpirun

2018-04-13 Thread Artem Polyakov
The output is certainly not enough to judge, but my first guess would be
that your MPI (which one is it, btw?) does not support the PMI flavor that
is enabled in Slurm.
Note also that Slurm now supports three ways of doing PMI, and from the
info that you have provided it is not clear which one you are using.
To judge with a reasonable level of confidence, the following info is needed:
* Which MPI implementation is used.
* What version.
* How it was configured.
If you do not provide the "--mpi" option to srun, you are using PMI1, which
is implemented in the Slurm core library. There are other implementations
available:
* PMI2 plugin ("srun --mpi=pmi2")
* PMIx plugin ("srun --mpi=pmix")
Those are more performant options but need some additional work: for the
PMI2 plugin you need to install the library from contrib/pmi2; for PMIx you
need to build Slurm --with-pmix=.
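
The three launch variants described above can be sketched as follows. This is only an illustration: srun_mpi_flag is a hypothetical helper and ./bt.C.121 is an assumed path to the benchmark binary. It also assumes the site has not changed the default MPI type, so that omitting "--mpi" really means PMI1. Running "srun --mpi=list" should show which plugin types the local Slurm build actually provides.

```shell
#!/bin/bash
# Map a PMI flavor name to the srun option mentioned in the message above.
# Prints an empty string for pmi1, since that is what srun is described as
# using when no --mpi option is given (assumption: site default unchanged).
srun_mpi_flag() {
    case "$1" in
        pmi1) echo "" ;;
        pmi2) echo "--mpi=pmi2" ;;
        pmix) echo "--mpi=pmix" ;;
        *)    echo "unknown PMI flavor: $1" >&2; return 1 ;;
    esac
}

# Print (but do not run) the launch line for each flavor.
# ./bt.C.121 is a hypothetical binary path.
for flavor in pmi1 pmi2 pmix; do
    flag=$(srun_mpi_flag "$flavor")
    echo "srun${flag:+ $flag} ./bt.C.121"
done
```

Whichever flavor is chosen, the MPI library has to be built with matching PMI support; otherwise each task may start as an independent single-rank job.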



-- 
С Уважением, Поляков Артем Юрьевич
Best regards, Artem Y. Polyakov


Re: [slurm-users] srun and mpirun

2018-04-13 Thread Mahmood Naderan
I tried with one of the NAS benchmarks (BT) with 121 processes, since the
number of processes must be a square. With srun, I get

 WARNING: compiled for   121 processes
 Number of active processes: 1

   0   1 408 408 408
  Problem size too big for compiled array sizes




However, with mpirun, it seems to be fine

 Number of active processes:   121

 Time step    1
 Time step   20
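
For reference, BT is run with a square number of MPI processes, which is why 121 was used here rather than the 120 tasks of the earlier hello-world scripts. A minimal sketch of a pre-submission check in plain bash follows; is_square is a hypothetical helper and not part of Slurm or the NAS benchmarks.

```shell
#!/bin/bash
# Return success if $1 is a perfect square (1, 4, 9, ..., 121, ...).
is_square() {
    local n=$1 r=0
    # Integer square root by simple counting; fine for job-sized numbers.
    while (( (r + 1) * (r + 1) <= n )); do
        r=$(( r + 1 ))
    done
    (( r * r == n ))
}

# Compare the two task counts that appear in this thread.
for n in 120 121; do
    if is_square "$n"; then
        echo "$n tasks: square, usable for BT"
    else
        echo "$n tasks: not square"
    fi
done
```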


Regards,
Mahmood




On Fri, Apr 13, 2018 at 5:46 PM, Chris Samuel  wrote:
> On 13/4/18 7:19 pm, Mahmood Naderan wrote:
>
>> I see some old posts on the web about performance comparison of srun vs.
>> mpirun. Is that still an issue?
>
>
> Just running an MPI hello world program is not going to test that.
>
> You need to run an actual application that is doing a lot of
> computation and communications instead.  Something like NAMD
> or a synthetic benchmark like HPL.
>
> All the best,
> Chris
> --
>  Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
>



Re: [slurm-users] srun and mpirun

2018-04-13 Thread Chris Samuel

On 13/4/18 7:19 pm, Mahmood Naderan wrote:

> I see some old posts on the web about performance comparison of srun
> vs. mpirun. Is that still an issue?


Just running an MPI hello world program is not going to test that.

You need to run an actual application that is doing a lot of
computation and communications instead.  Something like NAMD
or a synthetic benchmark like HPL.
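
A rough way to compare the two launchers on such a workload is to time the same binary under each, inside the same allocation. The sketch below is only an illustration: elapsed is a hypothetical helper, ./xhpl is an assumed path to an HPL binary, and bash's SECONDS counter has whole-second resolution, so it only makes sense for runs that last minutes.

```shell
#!/bin/bash
# Print the wall-clock seconds a command takes, using bash's SECONDS counter.
# The command's own output is discarded so only the timing is printed.
elapsed() {
    local start=$SECONDS
    "$@" >/dev/null 2>&1
    echo $(( SECONDS - start ))
}

# Inside a job script one might compare (./xhpl is a hypothetical path):
#   echo "srun:   $(elapsed srun ./xhpl) s"
#   echo "mpirun: $(elapsed mpirun ./xhpl) s"

# Local demonstration that does not need Slurm:
echo "sleep 1 took $(elapsed sleep 1) s"
```

Several repetitions per launcher are advisable, since a single run can be skewed by node placement or file system load.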

All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC



Re: [slurm-users] srun and mpirun

2018-04-13 Thread Peter Kjellström
On Fri, 13 Apr 2018 13:49:56 +0430
Mahmood Naderan  wrote:

> Hi,
> I see some old posts on the web about performance comparison of srun
> vs. mpirun. Is that still an issue? Both of the following scripts work
> for test programs, and naturally no performance difference is visible
> here.
... 
> #SBATCH --time=10:00
> mpirun mpihello
...
> #SBATCH --time=10:00
> srun mpihello

The difference will depend on the MPI implementation.

For example, with some MPIs mpirun will launch via ssh (which may end up
placing processes really badly with respect to job cgroups etc.), while
others may auto-detect Slurm and use srun to spawn ranks or proxies.

Then there are very different performance aspects that can be affected
(or at least work differently). For example, launch and setup is one
thing, but process pinning behavior is another...

It's not a very simple question in the general case.
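
One way to see how a given launcher placed and pinned the processes is to start a small reporting script in place of the MPI program and compare the output between srun and mpirun. This is a sketch under assumptions: the file name whereami.sh and the helper names are hypothetical, the CPU list comes from the Linux-specific /proc/self/status, and the rank variables are the ones that, to my knowledge, Open MPI (OMPI_COMM_WORLD_RANK), MPICH-style launchers (PMI_RANK) and srun (SLURM_PROCID) export.

```shell
#!/bin/bash
# Save as whereami.sh (hypothetical name) and launch it with both
#   srun ./whereami.sh    and    mpirun ./whereami.sh
# inside the same allocation, then compare the output.

rank_id() {
    # Prefer the MPI launcher's own variable; fall back to Slurm's task id.
    echo "${OMPI_COMM_WORLD_RANK:-${PMI_RANK:-${SLURM_PROCID:-unknown}}}"
}

allowed_cpus() {
    # Linux-specific: CPUs this process may run on. awk inherits the
    # affinity of the calling shell, so its own status reflects the shell's.
    awk '/^Cpus_allowed_list:/ {print $2}' /proc/self/status 2>/dev/null
}

echo "rank $(rank_id) host $(uname -n) cpus $(allowed_cpus)"
```

If every rank reports the full CPU list of the node, no pinning was applied by that launcher; if ranks land on hosts or CPUs outside the allocation, the launcher is probably bypassing Slurm.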

/Peter K



[slurm-users] srun and mpirun

2018-04-13 Thread Mahmood Naderan
Hi,
I see some old posts on the web about performance comparison of srun
vs. mpirun. Is that still an issue? Both of the following scripts work
for test programs, and naturally no performance difference is visible
here.

#!/bin/bash
#SBATCH --job-name=hello_mpi
#SBATCH --output=hellompi.log
#SBATCH --ntasks=120
#SBATCH --time=10:00
mpirun mpihello

and

#!/bin/bash
#SBATCH --job-name=hello_mpi
#SBATCH --output=hellompi.log
#SBATCH --ntasks=120
#SBATCH --time=10:00
srun mpihello


Regards,
Mahmood