[OMPI users] "failed to create queue pair" problem, but settings appear OK

2016-06-16 Thread Sasso, John (GE Power, Non-GE)
Thank you, Nathan.  Since the default btl_openib_receive_queues setting is:

P,128,256,192,128:S,2048,1024,1008,64:S,12288,1024,1008,64:S,65536,1024,1008,64

this would mean that, with max_qp = 392632 and the 4 queues defined above, the
"actual" max would be 392632 / 4 = 98158.  Using this value in my prior math, the
upper bound on the number of 24-core nodes would be 98158 / 24^2 ~ 170.  This
comes closer to the limit I encountered while testing.  I'm sure there are
other particulars I am not accounting for in this math, but the approximation
is reasonable.
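
For anyone who wants to repeat this estimate on their own fabric, a rough sketch (assuming bash, ibv_devinfo on the PATH, and the queue count and ranks-per-node used in this thread):

# approximate upper bound on node count before exhausting QPs
max_qp=$(ibv_devinfo -v | awk '/max_qp:/ {print $2; exit}')
nqueues=4     # queues defined in btl_openib_receive_queues (1 PP + 3 SRQ by default)
ppn=24        # ranks per node
echo $(( max_qp / nqueues / (ppn * ppn) ))   # ~170 with the numbers above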

Thanks for the clarification, Nathan!

--john

-Original Message-
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Nathan Hjelm
Sent: Thursday, June 16, 2016 9:56 AM
To: Open MPI Users
Subject: EXT: Re: [OMPI users] "failed to create queue pair" problem, but 
settings appear OK

XRC support is greatly improved in 1.10.x and 2.0.0. Would be interesting to 
see if a newer version fixed the shutdown hang.

When calculating the required number of queue pairs you also have to divide by
the number of queues defined in the btl_openib_receive_queues parameter.
Additionally, Open MPI uses 1 QP per rank for connections (1.7+), and there are
some in use by IPoIB and other services.

-Nathan

> On Jun 16, 2016, at 7:15 AM, Sasso, John (GE Power, Non-GE) 
> <john1.sa...@ge.com> wrote:
> 
> Nathan,
> 
> Thank you for the suggestion.   I tried your btl_openib_receive_queues 
> setting with a 4200+ core IMB job, and the job ran (great!).   The shutdown 
> of the job took such a long time that after 6 minutes, I had to 
> force-terminate the job.
> 
> When I tried using this scheme before, with the following recommended by the 
> OpenMPI FAQ, I got some odd errors:
> 
> --mca btl openib,sm,self --mca btl_openib_receive_queues
> X,128,256,192,128:X,2048,256,128,32:X,12288,256,128,32:X,65536,256,128,32
> 
> However, when I tried:
> 
> --mca btl openib,sm,self --mca btl_openib_receive_queues 
> X,4096,1024:X,12288,512:X,65536,512
> 
> I got success with my aforementioned job.
> 
> I am going to do more testing, with the goal of getting a 5000 core job to 
> run successfully.  If I can, then down the road my concern is the impact the 
> btl_openib_receive_queues mca parameter (above) will have on lower-scale (< 
> 1024 cores) jobs, since the parameter change to the global openmpi config 
> file would impact ALL users and jobs of all scales.
> 
> Chuck – as I noted in my first email, log_num_mtt was set fine, so that is 
> not the issue here.
> 
> Finally, regarding running out of QPs, I examined the output of ‘ibv_devinfo 
> –v’ on our compute nodes.  I see the following pertinent settings:
> 
> max_qp:              392632
> max_qp_wr:           16351
> max_qp_rd_atom:      16
> max_qp_init_rd_atom: 128
> max_cq:              65408
> max_cqe:             4194303
> 
> Figuring that max_qp is the prime limitation I am running into when using the
> PP and SRQ QPs, and considering our 24-core nodes, this would seem to imply
> that an upper bound on the number of nodes would be 392632 / 24^2 ~ 681 nodes.
> This does not make sense, because I saw the QP creation failure error (again,
> NO error about failure to register enough memory) with as few as 177 24-core
> nodes!  I don't know how to make sense of this, though I don't question that
> we were running out of QPs.
> 
> --john
> 
> 
> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Nathan 
> Hjelm
> Sent: Wednesday, June 15, 2016 2:43 PM
> To: Open MPI Users
> Subject: EXT: Re: [OMPI users] "failed to create queue pair" problem, 
> but settings appear OK
> 
> You ran out of queue pairs. There is no way around this for larger all-to-all 
> transfers when using the openib btl and SRQ. Need O(cores^2) QPs to fully 
> connect with SRQ or PP QPs. I recommend using XRC instead by adding:
> 
> btl_openib_receive_queues = X,4096,1024:X,12288,512:X,65536,512
> 
> 
> to your openmpi-mca-params.conf
> 
> or
> 
> -mca btl_openib_receive_queues X,4096,1024:X,12288,512:X,65536,512
> 
> 
> to the mpirun command line.
> 
> 
> -Nathan
> 
> On Jun 15, 2016, at 12:35 PM, "Sasso, John (GE Power, Non-GE)" 
> <john1.sa...@ge.com> wrote:
> 
> Chuck,
> 
> The per-process limits appear fine, including those for the resource mgr 
> daemons:
> 
> Limit                 Soft Limit   Hard Limit   Units
> Max address space     unlimited    unlimited    bytes
> Max core file size    0            0            bytes
> Max cpu time          unlimited    unlimited    seconds
> Max data size         unlimited    unlimited    bytes
> Max file locks        unlimited    unlimited    locks

[OMPI users] "failed to create queue pair" problem, but settings appear OK

2016-06-16 Thread Sasso, John (GE Power, Non-GE)
Nathan,

Thank you for the suggestion.   I tried your btl_openib_receive_queues setting 
with a 4200+ core IMB job, and the job ran (great!).   The shutdown of the job 
took such a long time that after 6 minutes, I had to force-terminate the job.

When I tried using this scheme before, with the following recommended by the 
OpenMPI FAQ, I got some odd errors:

--mca btl openib,sm,self --mca btl_openib_receive_queues 
X,128,256,192,128:X,2048,256,128,32:X,12288,256,128,32:X,65536,256,128,32

However, when I tried:

--mca btl openib,sm,self --mca btl_openib_receive_queues 
X,4096,1024:X,12288,512:X,65536,512

I got success with my aforementioned job.

I am going to do more testing, with the goal of getting a 5000-core job to run
successfully.  If I can, then my concern down the road is the impact the
btl_openib_receive_queues MCA parameter (above) will have on lower-scale
(< 1024 core) jobs, since changing it in the global Open MPI config file would
impact ALL users and jobs of all scales.
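
One possible way to keep the global file untouched while testing (a sketch, not something tried here) is to scope the parameter per job through Open MPI's MCA environment variables in the batch script; the rank count and binary name below are illustrative:

# set only for this job; openmpi-mca-params.conf stays untouched
export OMPI_MCA_btl_openib_receive_queues=X,4096,1024:X,12288,512:X,65536,512
mpirun -np 4224 ./IMB-MPI1 Alltoall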

Chuck - as I noted in my first email, log_num_mtt was set fine, so that is not 
the issue here.

Finally, regarding running out of QPs, I examined the output of 'ibv_devinfo 
-v' on our compute nodes.  I see the following pertinent settings:

max_qp:              392632
max_qp_wr:           16351
max_qp_rd_atom:      16
max_qp_init_rd_atom: 128
max_cq:              65408
max_cqe:             4194303

Figuring that max_qp is the prime limitation I am running into when using the
PP and SRQ QPs, and considering our 24-core nodes, this would seem to imply that
an upper bound on the number of nodes would be 392632 / 24^2 ~ 681 nodes.
This does not make sense, because I saw the QP creation failure error (again,
NO error about failure to register enough memory) with as few as 177 24-core
nodes!  I don't know how to make sense of this, though I don't question that we
were running out of QPs.

--john


From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Nathan Hjelm
Sent: Wednesday, June 15, 2016 2:43 PM
To: Open MPI Users
Subject: EXT: Re: [OMPI users] "failed to create queue pair" problem, but 
settings appear OK

You ran out of queue pairs. There is no way around this for larger all-to-all 
transfers when using the openib btl and SRQ. Need O(cores^2) QPs to fully 
connect with SRQ or PP QPs. I recommend using XRC instead by adding:


btl_openib_receive_queues = X,4096,1024:X,12288,512:X,65536,512

to your openmpi-mca-params.conf

or

-mca btl_openib_receive_queues X,4096,1024:X,12288,512:X,65536,512


to the mpirun command line.
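
For concreteness, a full command line combining this with the btl selection used earlier in the thread might look like the following (the IMB binary name and rank count are illustrative):

mpirun -np 4224 --mca btl openib,sm,self \
       --mca btl_openib_receive_queues X,4096,1024:X,12288,512:X,65536,512 \
       ./IMB-MPI1 -msglog 4:10 Alltoall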


-Nathan

On Jun 15, 2016, at 12:35 PM, "Sasso, John (GE Power, Non-GE)" 
<john1.sa...@ge.com<mailto:john1.sa...@ge.com>> wrote:
Chuck,

The per-process limits appear fine, including those for the resource mgr 
daemons:

Limit                     Soft Limit   Hard Limit   Units
Max address space         unlimited    unlimited    bytes
Max core file size        0            0            bytes
Max cpu time              unlimited    unlimited    seconds
Max data size             unlimited    unlimited    bytes
Max file locks            unlimited    unlimited    locks
Max file size             unlimited    unlimited    bytes
Max locked memory         unlimited    unlimited    bytes
Max msgqueue size         819200       819200       bytes
Max nice priority         0            0
Max open files            16384        16384        files
Max pending signals       515625       515625       signals
Max processes             515625       515625       processes
Max realtime priority     0            0
Max realtime timeout      unlimited    unlimited    us
Max resident set          unlimited    unlimited    bytes
Max stack size            30720        unlimited    bytes



As for the FAQ re registered memory, checking our OpenMPI settings with 
ompi_info, we have:

mpool_rdma_rcache_size_limit = 0 ==> Open MPI will register as much user memory 
as necessary
btl_openib_free_list_max = -1 ==> Open MPI will try to allocate as many 
registered buffers as it needs
btl_openib_eager_rdma_num = 16
btl_openib_max_eager_rdma = 16
btl_openib_eager_limit = 12288


Other suggestions welcome. Hitting a brick wall here. Thanks!

--john



-Original Message-
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gus Correa
Sent: Wednesday, June 15, 2016 1:39 PM
To: Open MPI Users
Subject: EXT: Re: [OMPI users] "failed to create queue pair" problem, but 
settings appear OK

Hi John

1) For diagnostic purposes, you could check the actual "per process" limits on
the nodes while that big job is running:

cat /proc/$PID/limits
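
While the job is running, something along these lines (assuming pgrep is available; the executable name is illustrative) shows the limits the ranks actually inherited on a node:

# check the locked-memory limit of every rank of the job on this node
for pid in $(pgrep -u "$USER" IMB-MPI1); do
    grep -H 'locked memory' /proc/"$pid"/limits
done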

2) If you're using a resource manager to launch the job, the resource manager
daemons (local to the nodes) may have to set the memlock and other limits, so
that the Open MPI processes inherit them.
I use Torque, so I put these lines in the pbs_mom (Torque local daemon) 
initialization script:

# pbs_mom system limits
# max file descriptors
ulimit -n 32768
# locked memory
ulimit -l unlimited
# stacksize
ulimit -s unlimited

3) See also this FAQ related to registered memory.
I set these parameters in /etc/modprobe.d/mlx4_core.conf, but where they're set
may depend on the Linux distro/release and the OFED you're using.

Re: [OMPI users] "failed to create queue pair" problem, but settings appear OK

2016-06-15 Thread Sasso, John (GE Power, Non-GE)
QUESTION: Since the error said the system may have run out of queue pairs, how
do I determine the number of queue pairs the IB HCA can support?
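
One way to check, and what ends up being used later in this thread, is to query the HCA directly on a compute node:

# max_qp is the device-wide queue pair limit reported by the HCA
ibv_devinfo -v | grep -E 'max_qp|max_cq'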


-Original Message-
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Sasso, John (GE 
Power, Non-GE)
Sent: Wednesday, June 15, 2016 2:35 PM
To: Open MPI Users
Subject: EXT: [OMPI users] "failed to create queue pair" problem, but settings 
appear OK

Chuck, 

The per-process limits appear fine, including those for the resource mgr 
daemons:

Limit                     Soft Limit   Hard Limit   Units
Max address space         unlimited    unlimited    bytes
Max core file size        0            0            bytes
Max cpu time              unlimited    unlimited    seconds
Max data size             unlimited    unlimited    bytes
Max file locks            unlimited    unlimited    locks
Max file size             unlimited    unlimited    bytes
Max locked memory         unlimited    unlimited    bytes
Max msgqueue size         819200       819200       bytes
Max nice priority         0            0
Max open files            16384        16384        files
Max pending signals       515625       515625       signals
Max processes             515625       515625       processes
Max realtime priority     0            0
Max realtime timeout      unlimited    unlimited    us
Max resident set          unlimited    unlimited    bytes
Max stack size            30720        unlimited    bytes



As for the FAQ re registered memory, checking our OpenMPI settings with 
ompi_info, we have:

mpool_rdma_rcache_size_limit = 0   ==> Open MPI will register as much user memory as necessary
btl_openib_free_list_max = -1      ==> Open MPI will try to allocate as many registered buffers as it needs
btl_openib_eager_rdma_num = 16
btl_openib_max_eager_rdma = 16
btl_openib_eager_limit = 12288


Other suggestions welcome.   Hitting a brick wall here.  Thanks!

--john



-Original Message-
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gus Correa
Sent: Wednesday, June 15, 2016 1:39 PM
To: Open MPI Users
Subject: EXT: Re: [OMPI users] "failed to create queue pair" problem, but 
settings appear OK

Hi John

1) For diagnostic purposes, you could check the actual "per process" limits on
the nodes while that big job is running:

cat /proc/$PID/limits

2) If you're using a resource manager to launch the job, the resource manager
daemons (local to the nodes) may have to set the memlock and other limits, so
that the Open MPI processes inherit them.
I use Torque, so I put these lines in the pbs_mom (Torque local daemon) 
initialization script:

# pbs_mom system limits
# max file descriptors
ulimit -n 32768
# locked memory
ulimit -l unlimited
# stacksize
ulimit -s unlimited

3) See also this FAQ related to registered memory.
I set these parameters in /etc/modprobe.d/mlx4_core.conf, but where they're set 
may depend on the Linux distro/release and the OFED you're using.
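
Purely as an illustration (the right values depend on node RAM and OFED release; the numbers below simply mirror what is reported elsewhere in this thread), a mlx4_core.conf entry looks like:

# /etc/modprobe.d/mlx4_core.conf -- example values only
options mlx4_core log_num_mtt=24 log_mtts_per_seg=0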

https://www.open-mpi.org/faq/?category=openfabrics#ib-low-reg-mem

I hope this helps,
Gus Correa

On 06/15/2016 11:05 AM, Sasso, John (GE Power, Non-GE) wrote:
>
> In doing testing with IMB, I find that running a 4200+ core case with 
> the IMB test Alltoall, and message lengths of 16..1024 bytes (as per 
> -msglog 4:10 IMB option), it fails with:
>
> --
> 
>
> A process failed to create a queue pair. This usually means either
>
> the device has run out of queue pairs (too many connections) or
>
> there are insufficient resources available to allocate a queue pair
>
> (out of memory). The latter can happen if either 1) insufficient
>
> memory is available, or 2) no more physical memory can be registered
>
> with the device.
>
> For more information on memory registration see the Open MPI FAQs at:
>
> http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
>

[OMPI users] "failed to create queue pair" problem, but settings appear OK

2016-06-15 Thread Sasso, John (GE Power, Non-GE)
Chuck, 

The per-process limits appear fine, including those for the resource mgr 
daemons:

Limit                     Soft Limit   Hard Limit   Units
Max address space         unlimited    unlimited    bytes
Max core file size        0            0            bytes
Max cpu time              unlimited    unlimited    seconds
Max data size             unlimited    unlimited    bytes
Max file locks            unlimited    unlimited    locks
Max file size             unlimited    unlimited    bytes
Max locked memory         unlimited    unlimited    bytes
Max msgqueue size         819200       819200       bytes
Max nice priority         0            0
Max open files            16384        16384        files
Max pending signals       515625       515625       signals
Max processes             515625       515625       processes
Max realtime priority     0            0
Max realtime timeout      unlimited    unlimited    us
Max resident set          unlimited    unlimited    bytes
Max stack size            30720        unlimited    bytes



As for the FAQ re registered memory, checking our OpenMPI settings with 
ompi_info, we have:

mpool_rdma_rcache_size_limit = 0   ==> Open MPI will register as much user memory as necessary
btl_openib_free_list_max = -1      ==> Open MPI will try to allocate as many registered buffers as it needs
btl_openib_eager_rdma_num = 16
btl_openib_max_eager_rdma = 16
btl_openib_eager_limit = 12288


Other suggestions welcome.   Hitting a brick wall here.  Thanks!

--john



-Original Message-
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gus Correa
Sent: Wednesday, June 15, 2016 1:39 PM
To: Open MPI Users
Subject: EXT: Re: [OMPI users] "failed to create queue pair" problem, but 
settings appear OK

Hi John

1) For diagnostic purposes, you could check the actual "per process" limits on
the nodes while that big job is running:

cat /proc/$PID/limits

2) If you're using a resource manager to launch the job, the resource manager
daemons (local to the nodes) may have to set the memlock and other limits, so
that the Open MPI processes inherit them.
I use Torque, so I put these lines in the pbs_mom (Torque local daemon) 
initialization script:

# pbs_mom system limits
# max file descriptors
ulimit -n 32768
# locked memory
ulimit -l unlimited
# stacksize
ulimit -s unlimited

3) See also this FAQ related to registered memory.
I set these parameters in /etc/modprobe.d/mlx4_core.conf, but where they're set 
may depend on the Linux distro/release and the OFED you're using.

https://www.open-mpi.org/faq/?category=openfabrics#ib-low-reg-mem

I hope this helps,
Gus Correa

On 06/15/2016 11:05 AM, Sasso, John (GE Power, Non-GE) wrote:
>
> In doing testing with IMB, I find that running a 4200+ core case with 
> the IMB test Alltoall, and message lengths of 16..1024 bytes (as per 
> -msglog 4:10 IMB option), it fails with:
>
> --
> 
>
> A process failed to create a queue pair. This usually means either
>
> the device has run out of queue pairs (too many connections) or
>
> there are insufficient resources available to allocate a queue pair
>
> (out of memory). The latter can happen if either 1) insufficient
>
> memory is available, or 2) no more physical memory can be registered
>
> with the device.
>
> For more information on memory registration see the Open MPI FAQs at:
>
> http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
>
> Local host: node7106
>
> Local device:   mlx4_0
>
> Queue pair type: Reliable connected (RC)
>
> --
> 
>
> [node7106][[51922,1],0][connect/btl_openib_connect_oob.c:867:rml_recv_
> cb]
> error in endpoint reply start connect
>
> [node7106:06503] [[51922,0],0]-[[51922,1],0] mca_oob_tcp_msg_recv: 
> readv failed: Connection reset by peer (104)
>
> ---

[OMPI users] "failed to create queue pair" problem, but settings appear OK

2016-06-15 Thread Sasso, John (GE Power, Non-GE)
In doing testing with IMB, I find that running a 4200+ core case with the IMB 
test Alltoall, and message lengths of 16..1024 bytes (as per -msglog 4:10 IMB 
option), it fails with:

--
A process failed to create a queue pair. This usually means either
the device has run out of queue pairs (too many connections) or
there are insufficient resources available to allocate a queue pair
(out of memory). The latter can happen if either 1) insufficient
memory is available, or 2) no more physical memory can be registered
with the device.

For more information on memory registration see the Open MPI FAQs at:
http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages

Local host: node7106
Local device:   mlx4_0
Queue pair type:  Reliable connected (RC)
--
[node7106][[51922,1],0][connect/btl_openib_connect_oob.c:867:rml_recv_cb] error 
in endpoint reply start connect
[node7106:06503] [[51922,0],0]-[[51922,1],0] mca_oob_tcp_msg_recv: readv 
failed: Connection reset by peer (104)
--
mpirun has exited due to process rank 0 with PID 6504 on
node node7106 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--

Yes, these are ALL of the error messages.  I did not get a message about not 
being able to register enough memory.   I verified that log_num_mtt = 24  and 
log_mtts_per_seg = 0 (via catting of their files in 
/sys/module/mlx4_core/parameters and what is set in 
/etc/modprobe.d/mlx4_core.conf).  While such a large-scale job runs, I run 
'vmstat 10' to examine memory usage, but there appears to be a good amount of 
memory still available and swap is never used.   In terms of settings in 
/etc/security/limits.conf:

* soft memlock  unlimited
* hard memlock  unlimited
* soft stack 30
* hard stack unlimited
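
For reference, the verification described above boils down to the following on each node (ulimit -l has to be checked from within the job's environment to be meaningful):

cat /sys/module/mlx4_core/parameters/log_num_mtt        # 24 on these nodes
cat /sys/module/mlx4_core/parameters/log_mtts_per_seg   # 0 on these nodes
ulimit -l                                               # should report "unlimited"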

I don't know if btl_openib_connect_oob.c or mca_oob_tcp_msg_recv are clues, but 
I am now at a loss as to where the problem lies.

This is for an application using OpenMPI 1.6.5, and the systems have Mellanox 
OFED 3.1.1 installed.

--john



Re: [OMPI users] Singleton process spawns additional thread

2016-01-07 Thread Sasso, John (GE Power, Non-GE)
Stefan,  I don't know if this is related to your issue, but FYI...


> Those are async progress threads - they block unless something requires doing
>
>
>> On Apr 15, 2015, at 8:36 AM, Sasso, John (GE Power & Water, Non-GE) 
>>  wrote:
>> 
>> I stumbled upon something while using 'ps -eFL' to view threads of 
>> processes, and Google searches have failed to answer my question.  This 
>> question holds for OpenMPI 1.6.x and even OpenMPI 1.4.x.
 >> 
>> For a program which is pure MPI (built and run using OpenMPI) and does not 
>> implement Pthreads or OpenMP, why is it that each MPI task appears as having 
>> 3 threads:
 >>
>> UID        PID   PPID    LWP  C NLWP     SZ    RSS PSR STIME TTY      TIME CMD
>> sasso    20512  20493  20512 99    3 187849 582420  14 11:01 ?    00:26:37 /home/sasso/mpi_example.exe
>> sasso    20512  20493  20588  0    3 187849 582420  11 11:01 ?    00:00:00 /home/sasso/mpi_example.exe
>> sasso    20512  20493  20599  0    3 187849 582420  12 11:01 ?    00:00:00 /home/sasso/mpi_example.exe
 >>
>> whereas if I compile and run a non-MPI program, 'ps -eFL' shows it running 
>> as a single thread?
>>
>> Granted, the CPU utilization (C) for 2 of the 3 threads is zero, but the
>> threads are bound to different processors (11, 12, 14).  I am curious as to
>> why this is; I am not complaining that there is a problem.  Thanks!
 >> 
>> --john



-Original Message-
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Au Eelis
Sent: Thursday, January 07, 2016 7:10 AM
To: us...@open-mpi.org
Subject: [OMPI users] Singleton process spawns additional thread

Hi!

I have a weird problem with executing a singleton OpenMPI program, where an 
additional thread causes full load, while the master thread performs the actual 
calculations.

In contrast, executing "mpirun -np 1 [executable]" performs the same 
calculation at the same speed but the additional thread is idling.

In my understanding, both calculations should behave in the same way (i.e., one 
working thread) for a program which is simply moving some data around (mainly 
some MPI_BCAST and MPI_GATHER commands).

I could observe this behaviour in OpenMPI 1.10.1 with ifort 16.0.1 and gfortran 
5.3.0. I could create a minimal working example, which is appended to this mail.

Am I missing something?

Best regards,
Stefan

-

MWE: Compile this with "mpifort main.f90". When executing with "./a.out", there
is a thread wasting cycles while the master thread waits for input. When
executing with "mpirun -np 1 ./a.out", this thread is idling.

program main
 use mpi_f08
 implicit none

 integer :: ierror,rank

 call MPI_Init(ierror)
 call MPI_Comm_Rank(MPI_Comm_World,rank,ierror)

 ! let master thread wait on [RETURN]-key
 if (rank == 0) then
 read(*,*)
 end if

 write(*,*) rank

 call mpi_barrier(mpi_comm_world, ierror)
end program
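
A quick way to reproduce the observation (assuming the MWE above is saved as main.f90 and mpifort is Open MPI's wrapper compiler):

mpifort main.f90 -o mwe
./mwe                 # singleton launch; in another terminal: ps -eFL | grep mwe
mpirun -np 1 ./mwe    # same program via mpirun; the extra thread now idles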
___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: http://www.open-mpi.org/community/lists/users/2016/01/28237.php