Re: [OMPI users] Disable network interface selection

2018-07-01 Thread Gilles Gouaillardet

Carlos,


Open MPI 3.0.2 has been released, and it contains several bug fixes, so I do

encourage you to upgrade and try again.



if it still does not work, can you please run

mpirun --mca oob_base_verbose 10 ...

and then compress and post the output ?
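For example (the executable, host names, and log file name below are placeholders, not from the original message):

```shell
# Run with OOB verbosity, capture stdout and stderr together, then
# compress the log for posting to the list.
mpirun --mca oob_base_verbose 10 -n 2 -host host1,host2 ./a.out 2>&1 | tee oob.log
gzip oob.log   # produces oob.log.gz
```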


out of curiosity, would

mpirun --mca routed_radix 1 ...

work in your environment ?


once we can analyze the logs, we should be able to figure out what is 
going wrong.



Cheers,

Gilles

On 6/29/2018 4:10 AM, carlos aguni wrote:

Just realized my email wasn't sent to the archive.

On Sat, Jun 23, 2018 at 5:34 PM, carlos aguni wrote:


Hi!

Thank you all for your reply Jeff, Gilles and rhc.

Thank you Jeff and rhc for clarifying to me some of the openmpi's
internals.

>> FWIW: we never send interface names to other hosts - just dot
addresses
> Should have clarified - when you specify an interface name for the
MCA param, then it is the interface name that is transferred as
that is the value of the MCA param. However, once we determine our
address, we only transfer dot addresses between ourselves

If only dot addresses are sent to the hosts then why doesn't
openmpi use the default route like `ip route get `
instead of choosing a random one? Is it an expected behaviour? Can
it be changed?

Sorry. As Gilles pointed out I forgot to mention which openmpi
version I was using. I'm using openmpi 3.0.0 gcc 7.3.0 from
openhpc. Centos 7.5.

> mpirun --mca oob_tcp_if_exclude 192.168.100.0/24 ...

I cannot just exclude that interface cause after that I want to
add another computer that's on a different network. And this is
where things get messy :( I cannot just include and exclude
networks cause I have different machines on different networks.
This is what I want to achieve:




           compute01            compute02          compute03
ens3       192.168.100.104/24   10.0.0.227/24      192.168.100.105/24
ens8       10.0.0.228/24        172.21.1.128/24    ---
ens9       172.21.1.155/24      ---                ---


So I'm in compute01 MPI_spawning another process on compute02 and
compute03.
With both MPI_Spawn and `mpirun -n 3 -host
compute01,compute02,compute03 hostname`

Then when I include the mca parameters I get this:
`mpirun --oversubscribe --allow-run-as-root -n 3 --mca
oob_tcp_if_include 10.0.0.0/24,192.168.100.0/24 -host
compute01,compute02,compute03 hostname`
WARNING: An invalid value was given for oob_tcp_if_include. This
value will be ignored.
...
Message:    Did not find interface matching this subnet

This would all work if it were to use the system's internals like
`ip route`.

Best regards,
Carlos.




___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users



Re: [OMPI users] [EXTERNAL] Re: OpenMPI 3.1.0 Lock Up on POWER9 w/ CUDA9.2

2018-07-01 Thread Hammond, Simon David via users
Nathan,

Same issue with OpenMPI 3.1.1 on POWER9 with GCC 7.2.0 and CUDA9.2.

S.

-- 
Si Hammond
Scalable Computer Architectures
Sandia National Laboratories, NM, USA
[Sent from remote connection, excuse typos]
 

On 6/16/18, 10:10 PM, "Nathan Hjelm"  wrote:

Try the latest nightly tarball for v3.1.x. Should be fixed. 

> On Jun 16, 2018, at 5:48 PM, Hammond, Simon David via users 
 wrote:
> 
> The output from the test in question is:
> 
> Single thread test. Time: 0 s 10182 us 10 nsec/poppush
> Atomics thread finished. Time: 0 s 169028 us 169 nsec/poppush
> 
> 
> S.
> 
> -- 
> Si Hammond
> Scalable Computer Architectures
> Sandia National Laboratories, NM, USA
> [Sent from remote connection, excuse typos]
> 
> 
> On 6/16/18, 5:45 PM, "Hammond, Simon David"  wrote:
> 
>Hi OpenMPI Team,
> 
>We have recently updated an install of OpenMPI on a POWER9 system 
(configuration details below). We migrated from OpenMPI 2.1 to OpenMPI 3.1. We 
seem to have a symptom where code that ran before is now locking up and making 
no progress, getting stuck in wait-all operations. While I think it's prudent 
for us to root-cause this a little more, I have gone back and rebuilt MPI and 
re-run the "make check" tests. The opal_fifo test appears to hang forever. I am 
not sure if this is the cause of our issue but wanted to report that we are 
seeing this on our system.
> 
>OpenMPI 3.1.0 Configuration:
> 
>./configure 
--prefix=/home/projects/ppc64le-pwr9-nvidia/openmpi/3.1.0-nomxm/gcc/7.2.0/cuda/9.2.88
 --with-cuda=$CUDA_ROOT --enable-mpi-java --enable-java 
--with-lsf=/opt/lsf/10.1 
--with-lsf-libdir=/opt/lsf/10.1/linux3.10-glibc2.17-ppc64le/lib --with-verbs
> 
>GCC versions are 7.2.0, built by our team. CUDA is 9.2.88 from NVIDIA 
for POWER9 (standard download from their website). We enable IBM's JDK 8.0.0.
>RedHat: Red Hat Enterprise Linux Server release 7.5 (Maipo)
> 
>Output:
> 
>make[3]: Entering directory 
`/home/sdhammo/openmpi/openmpi-3.1.0/test/class'
>make[4]: Entering directory 
`/home/sdhammo/openmpi/openmpi-3.1.0/test/class'
>PASS: ompi_rb_tree
>PASS: opal_bitmap
>PASS: opal_hash_table
>PASS: opal_proc_table
>PASS: opal_tree
>PASS: opal_list
>PASS: opal_value_array
>PASS: opal_pointer_array
>PASS: opal_lifo
>
> 
>Output from Top:
> 
>20   0   73280   4224   2560 S 800.0  0.0  17:22.94 lt-opal_fifo
> 
>-- 
>Si Hammond
>Scalable Computer Architectures
>Sandia National Laboratories, NM, USA
>[Sent from remote connection, excuse typos]
> 
> 
> 
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users


___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] Disable network interface selection

2018-07-01 Thread carlos aguni
Just realized my email wasn't sent to the archive.

On Sat, Jun 23, 2018 at 5:34 PM, carlos aguni  wrote:

> Hi!
>
> Thank you all for your reply Jeff, Gilles and rhc.
>
> Thank you Jeff and rhc for clarifying to me some of the openmpi's
> internals.
>
> >> FWIW: we never send interface names to other hosts - just dot addresses
> > Should have clarified - when you specify an interface name for the MCA
> param, then it is the interface name that is transferred as that is the
> value of the MCA param. However, once we determine our address, we only
> transfer dot addresses between ourselves
>
> If only dot addresses are sent to the hosts then why doesn't openmpi use
> the default route like `ip route get ` instead of choosing a
> random one? Is it an expected behaviour? Can it be changed?
>
> Sorry. As Gilles pointed out I forgot to mention which openmpi version I
> was using. I'm using openmpi 3.0.0 gcc 7.3.0 from openhpc. Centos 7.5.
>
> > mpirun --mca oob_tcp_if_exclude 192.168.100.0/24 ...
>
> I cannot just exclude that interface cause after that I want to add
> another computer that's on a different network. And this is where things
> get messy :( I cannot just include and exclude networks cause I have
> different machines on different networks.
> This is what I want to achieve:
>
>
>            compute01            compute02          compute03
> ens3       192.168.100.104/24   10.0.0.227/24      192.168.100.105/24
> ens8       10.0.0.228/24        172.21.1.128/24    ---
> ens9       172.21.1.155/24      ---                ---
>
> So I'm in compute01 MPI_spawning another process on compute02 and
> compute03.
> With both MPI_Spawn and `mpirun -n 3 -host compute01,compute02,compute03
> hostname`
>
> Then when I include the mca parameters I get this:
> `mpirun --oversubscribe --allow-run-as-root -n 3 --mca oob_tcp_if_include
> 10.0.0.0/24,192.168.100.0/24 -host compute01,compute02,compute03 hostname`
> WARNING: An invalid value was given for oob_tcp_if_include.  This value
> will be ignored.
> ...
> Message: Did not find interface matching this subnet
>
> This would all work if it were to use the system's internals like `ip
> route`.
>
> Best regards,
> Carlos.
>
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] Disable network interface selection

2018-07-01 Thread carlos aguni
Hi!

Thank you all for your reply Jeff, Gilles and rhc.

Thank you Jeff and rhc for clarifying to me some of the openmpi's internals.

>> FWIW: we never send interface names to other hosts - just dot addresses
> Should have clarified - when you specify an interface name for the MCA
param, then it is the interface name that is transferred as that is the
value of the MCA param. However, once we determine our address, we only
transfer dot addresses between ourselves

If only dot addresses are sent to the hosts then why doesn't openmpi use
the default route like `ip route get ` instead of choosing a
random one? Is it an expected behaviour? Can it be changed?
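For reference, the kernel's choice that `ip route get` reports can be checked directly; the peer address below is one of the addresses discussed in this thread (substitute your own):

```shell
# Ask the kernel which source address and interface it would use to
# reach a given peer (10.0.0.227 is an address from this thread):
ip route get 10.0.0.227
# e.g. "10.0.0.227 dev ens8 src 10.0.0.228 ..."

# Extract just the chosen source address:
ip route get 10.0.0.227 | awk '{for (i = 1; i < NF; i++) if ($i == "src") print $(i + 1)}'
```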

Sorry. As Gilles pointed out I forgot to mention which openmpi version I
was using. I'm using openmpi 3.0.0 gcc 7.3.0 from openhpc. Centos 7.5.

> mpirun --mca oob_tcp_if_exclude 192.168.100.0/24 ...

I cannot just exclude that interface cause after that I want to add another
computer that's on a different network. And this is where things get messy
:( I cannot just include and exclude networks cause I have different
machines on different networks.
This is what I want to achieve:


           compute01            compute02          compute03
ens3       192.168.100.104/24   10.0.0.227/24      192.168.100.105/24
ens8       10.0.0.228/24        172.21.1.128/24    ---
ens9       172.21.1.155/24      ---                ---

So I'm in compute01 MPI_spawning another process on compute02 and compute03.
With both MPI_Spawn and `mpirun -n 3 -host compute01,compute02,compute03
hostname`

Then when I include the mca parameters I get this:
`mpirun --oversubscribe --allow-run-as-root -n 3 --mca oob_tcp_if_include
10.0.0.0/24,192.168.100.0/24 -host compute01,compute02,compute03 hostname`
WARNING: An invalid value was given for oob_tcp_if_include.  This value
will be ignored.
...
Message: Did not find interface matching this subnet
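One way to cross-check such a warning is to list each host's IPv4 addresses per interface and verify that every host named after -host really has an address in at least one of the included subnets (a generic diagnostic, not something from the original thread):

```shell
# Print "interface address/prefix" pairs for each IPv4 address on this
# host; run it on every host passed to -host to see which subnets each
# one actually has.
ip -o -4 addr show | awk '{print $2, $4}'
```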

This would all work if it were to use the system's internals like `ip
route`.

Best regards,
Carlos.

On Sat, Jun 23, 2018 at 12:27 AM, r...@open-mpi.org  wrote:

>
>
> On Jun 22, 2018, at 8:25 PM, r...@open-mpi.org wrote:
>
>
>
> On Jun 22, 2018, at 7:31 PM, Gilles Gouaillardet <
> gilles.gouaillar...@gmail.com> wrote:
>
> Carlos,
>
> By any chance, could
>
> mpirun --mca oob_tcp_if_exclude 192.168.100.0/24 ...
>
> work for you ?
>
> Which Open MPI version are you running ?
>
>
> IIRC, subnets are internally translated to interfaces, so that might be an
> issue if the translation is made on the first host and then the interface
> name is sent to the other hosts.
>
>
> FWIW: we never send interface names to other hosts - just dot addresses
>
>
> Should have clarified - when you specify an interface name for the MCA
> param, then it is the interface name that is transferred as that is the
> value of the MCA param. However, once we determine our address, we only
> transfer dot addresses between ourselves
>
>
>
>
> Cheers,
>
> Gilles
>
> On Saturday, June 23, 2018, carlos aguni  wrote:
>
>> Hi all,
>>
>> I'm trying to run a code on 2 machines that each have at least 2 network
>> interfaces.
>> So I have them as described below:
>>
>>            compute01            compute02
>> ens3       192.168.100.104/24   10.0.0.227/24
>> ens8       10.0.0.228/24        172.21.1.128/24
>> ens9       172.21.1.155/24      ---
>>
>> Issue is. When I execute `mpirun -n 2 -host compute01,compute02 hostname`
>> on them what I get is the correct output after a very long delay..
>>
>> What I've read so far is that OpenMPI performs a greedy algorithm on each
>> interface that times out if it doesn't find the desired IP.
>> Then I saw here (https://www.open-mpi.org/faq/?category=tcp#tcp-selection)
>> that I can run commands like:
>> `$ mpirun -n 2 --mca oob_tcp_if_include 10.0.0.0/24 -host
>> compute01,compute02 hostname`
>> But this configuration doesn't reach the other host(s).
>> In the end I sometimes get the same timeout.
>>
>> So is there a way to let it to use the system's default route?
>>
>> Regards,
>> Carlos.
>>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users
>
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] OpenMPI + PMIx + SLURM

2018-07-01 Thread Charles A Taylor
Just wanted to follow up on my own post.

Turns out there was a missing symlink (much embarrassment) on my build host.
That's why you don't see "pmix_v1" in the "srun --mpi=list" output (previous
post).
Once I fixed that and rebuilt SLURM, I was able to launch existing OpenMPI 3.x 
apps with,

  srun --mpi=pmix_v1

Apologies for the wasted bandwidth.

Regards,

Charlie

> On Jun 28, 2018, at 8:14 AM, Charles A Taylor  wrote:
> 
> There is a name for my pain and it is “OpenMPI + PMIx”.  :)
> 
> I’m looking at upgrading SLURM from 16.05.11 to 17.11.05 (bear with me, this 
> is not a SLURM question).
> 
> After building SLURM 17.11.05 with 
> ‘--with-pmix=/opt/pmix/1.1.5:/opt/pmix/2.1/1’ and installing a test instance, 
> I see
> 
> $ srun --mpi=list
> srun: MPI types are...
> srun: pmix
> srun: pmi2
> srun: pmix_v2
> srun: none
> srun: openmpi
> 
> Seems reasonable.
> 
> Now, we have applications built with OpenMPI 3.0.0 and 3.1.0 linked against 
> /opt/pmix/1.1.5 (--with-pmix=/opt/pmix/1.1.5).  When I attempt to launch 
> these applications using,
> 
>  srun --mpi=pmix
> 
> I get the following ...
> 
> [c1a-s18.ufhpc:17995] Security mode none is not available
> [c1a-s18.ufhpc:17995] PMIX ERROR: UNREACHABLE in file 
> src/client/pmix_client.c at line 199
> --
> The application appears to have been direct launched using "srun",
> but OMPI was not built with SLURM's PMI support and therefore cannot
> execute. There are several options for building PMI support under
> SLURM, depending upon the SLURM version you are using:
> 
>  version 16.05 or later: you can use SLURM's PMIx support. This
>  requires that you configure and build SLURM --with-pmix.
> 
>  Versions earlier than 16.05: you must use either SLURM's PMI-1 or
>  PMI-2 support. SLURM builds PMI-1 by default, or you can manually
>  install PMI-2. You must then build Open MPI using --with-pmi pointing
>  to the SLURM PMI library location.
> 
> Please configure as appropriate and try again.
> --
> *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> ***and potentially your MPI job)
> ———
> 
> So slurm/srun appear to have library support for both pmix and pmix_v2, and 
> OpenMPI 3.0.0 and OpenMPI 3.1.0 both have pmix support (1.1.5), since we 
> launch them every day with "srun --mpi=pmix" under slurm 16.05.11.
> 
> Is this a bug?   Am I overlooking something?  Is it possible to transition to 
> OpenMPI 3.x + PMIx 2.x + SLURM 17.x without rebuilding (essentially) 
> everything (including all applications)?
> 
> Charlie Taylor
> UF Research Computing
> 

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

[OMPI users] OpenMPI + PMIx + SLURM

2018-07-01 Thread Charles A Taylor
There is a name for my pain and it is “OpenMPI + PMIx”.  :)

I’m looking at upgrading SLURM from 16.05.11 to 17.11.05 (bear with me, this is 
not a SLURM question).

After building SLURM 17.11.05 with 
‘--with-pmix=/opt/pmix/1.1.5:/opt/pmix/2.1/1’ and installing a test instance, I 
see

$ srun --mpi=list
srun: MPI types are...
srun: pmix
srun: pmi2
srun: pmix_v2
srun: none
srun: openmpi

Seems reasonable.

Now, we have applications built with OpenMPI 3.0.0 and 3.1.0 linked against 
/opt/pmix/1.1.5 (--with-pmix=/opt/pmix/1.1.5).  When I attempt to launch these 
applications using,

  srun --mpi=pmix

I get the following ...

[c1a-s18.ufhpc:17995] Security mode none is not available
[c1a-s18.ufhpc:17995] PMIX ERROR: UNREACHABLE in file src/client/pmix_client.c 
at line 199
--
The application appears to have been direct launched using "srun",
but OMPI was not built with SLURM's PMI support and therefore cannot
execute. There are several options for building PMI support under
SLURM, depending upon the SLURM version you are using:

  version 16.05 or later: you can use SLURM's PMIx support. This
  requires that you configure and build SLURM --with-pmix.

  Versions earlier than 16.05: you must use either SLURM's PMI-1 or
  PMI-2 support. SLURM builds PMI-1 by default, or you can manually
  install PMI-2. You must then build Open MPI using --with-pmi pointing
  to the SLURM PMI library location.

Please configure as appropriate and try again.
--
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***and potentially your MPI job)
———

So slurm/srun appear to have library support for both pmix and pmix_v2, and 
OpenMPI 3.0.0 and OpenMPI 3.1.0 both have pmix support (1.1.5), since we launch 
them every day with "srun --mpi=pmix" under slurm 16.05.11.

Is this a bug?   Am I overlooking something?  Is it possible to transition to 
OpenMPI 3.x + PMIx 2.x + SLURM 17.x without rebuilding (essentially) everything 
(including all applications)?

Charlie Taylor
UF Research Computing

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] Enforcing specific interface and subnet usage

2018-07-01 Thread Maksym Planeta
Sorry for the late response, but I just wanted to inform you that I found 
another workaround, unrelated to the method we discussed here.


On 19/06/18 15:26, r...@open-mpi.org wrote:

The OMPI cmd line converts "--mca ptl_tcp_remote_connections 1" to 
OMPI_MCA_ptl_tcp_remote_connections, which is not recognized by PMIx. PMIx is 
looking for PMIX_MCA_ptl_tcp_remote_connections. The only way to set PMIx MCA 
params for the code embedded in OMPI is to put them in your environment.
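So the workaround described above would look like this (the host names and program are placeholders):

```shell
# PMIx MCA parameters for the PMIx code embedded in Open MPI must be
# set in the environment; they cannot be passed via "mpirun --mca".
export PMIX_MCA_ptl_tcp_remote_connections=1
mpirun -n 2 -host node01,node02 ./app   # node01, node02, ./app are placeholders
```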



On Jun 19, 2018, at 2:08 AM, Maksym Planeta  
wrote:

But what about remote connections parameter? Why is it not set?

On 19/06/18 00:58, r...@open-mpi.org wrote:

I’m not entirely sure I understand what you are trying to do. The 
PMIX_SERVER_URI2 envar tells local clients how to connect to their local PMIx 
server (i.e., the OMPI daemon on that node). This is always done over the 
loopback device since it is a purely local connection that is never used for 
MPI messages.
I’m sure that the tcp/btl is using your indicated subnet as that would be used 
for internode messages.

--
Regards,
Maksym Planeta




--
Regards,
Maksym Planeta
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] Disable network interface selection

2018-07-01 Thread carlos aguni
Hi!

Thank you all for your reply Jeff, Gilles and rhc.

Thank you Jeff and rhc for clarifying to me some of the openmpi's internals.

>> FWIW: we never send interface names to other hosts - just dot addresses
> Should have clarified - when you specify an interface name for the MCA
param, then it is the interface name that is transferred as that is the
value of the MCA param. However, once we determine our address, we only
transfer dot addresses between ourselves

If only dot addresses are sent to the hosts then why doesn't openmpi use
the default route like `ip route get ` instead of choosing a
random one? Is it an expected behaviour? Can it be changed?

Sorry. As Gilles pointed out I forgot to mention which openmpi version I
was using. I'm using openmpi 3.0.0 gcc 7.3.0 from openhpc. Centos 7.5.

> mpirun --mca oob_tcp_if_exclude 192.168.100.0/24 ...

I cannot just exclude that interface cause after that I want to add another
computer that's on a different network. And this is where things get messy
:( I cannot just include and exclude networks cause I have different
machines on different networks.
This is what I want to achieve:


           compute01            compute02          compute03
ens3       192.168.100.104/24   10.0.0.227/24      192.168.100.105/24
ens8       10.0.0.228/24        172.21.1.128/24    ---
ens9       172.21.1.155/24      ---                ---

So I'm in compute01 MPI_spawning another process on compute02 and compute03.
With both MPI_Spawn and `mpirun -n 3 -host compute01,compute02,compute03
hostname`

Then when I include the mca parameters I get this:
`mpirun --oversubscribe --allow-run-as-root -n 3 --mca oob_tcp_if_include
10.0.0.0/24,192.168.100.0/24 -host compute01,compute02,compute03 hostname`
WARNING: An invalid value was given for oob_tcp_if_include.  This value
will be ignored.
...
Message: Did not find interface matching this subnet

This would all work if it were to use the system's internals like `ip
route`.

Best regards,
Carlos.
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

[OMPI users] Error when using Open MPI shared library in a program compiled by g++

2018-07-01 Thread lille stor

Hi,

 

I have successfully built Open MPI version 2.1.3 from scratch on Ubuntu 14.04 64-bit using GCC 4.9. The result was the following shared libraries (needed for a program to use Open MPI):

        dummy@machine:~/$ ldd /home/dummy/openmpi/build/lib/libmpi.so /home/dummy/openmpi/build/lib/libopen-rte.so /home/dummy/openmpi/build/lib/libopen-pal.so

        /home/dummy/openmpi/build/lib/libmpi.so:
        linux-vdso.so.1 =>  (0x7ffecfb87000)
    libopen-rte.so.20 => not found
    libopen-pal.so.20 => not found
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x7fcc99058000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x7fcc98d51000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x7fcc98b33000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x7fcc9876a000)
    /lib64/ld-linux-x86-64.so.2 (0x5574ed2b8000)

    /home/dummy/openmpi/build/lib/libopen-rte.so:
    linux-vdso.so.1 =>  (0x7ffc99743000)
    libopen-pal.so.20 => not found
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x7ff6f498f000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x7ff6f45c6000)
    /lib64/ld-linux-x86-64.so.2 (0x5622db738000)

    /home/dummy/openmpi/build/lib/libopen-pal.so:
    linux-vdso.so.1 =>  (0x7ffdfc374000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x7f46b6bb5000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x7f46b69ad000)
    libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x7f46b67aa000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x7f46b63e)
    /lib64/ld-linux-x86-64.so.2 (0x55eca0708000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x7f46b61c2000)
 
 
Now, when I try to compile a small program with g++ version 4.9 using the Open MPI shared libraries, it fails with the error: /home/dummy/openmpi/build/lib/libopen-pal.so: undefined reference to `pthread_atfork'. The small program is the following:


        #include <stdio.h>
        #include "mpi.h"

        int main(void)
        {
            MPI_Comm comm = MPI_COMM_WORLD;
            int rank;

            MPI_Init(NULL, NULL);
            MPI_Comm_rank(comm, &rank);
            printf("<%d>\n", rank);

            MPI_Finalize();
            return 0;
        }

 

The compilation statement is:

        g++ test.c -pthread -I/home/dummy/openmpi/build/include -L/home/dummy/openmpi/build/lib -lmpi -lopen-rte -lopen-pal


Any idea why g++ (i.e. the C++ compiler) throws the aforementioned error? FYI, when I compile the same program with gcc (i.e. the C compiler), it works perfectly.
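For what it's worth, an alternative to listing the libraries by hand is Open MPI's wrapper compiler, which passes the full set of include paths, library paths, and link flags itself (the path below is assumed from the build prefix mentioned above):

```shell
# Compile and link with Open MPI's C++ wrapper instead of raw g++.
/home/dummy/openmpi/build/bin/mpicxx test.c -o test

# Show the underlying command line the wrapper would run:
/home/dummy/openmpi/build/bin/mpicxx --showme test.c -o test
```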

 

Thank you,

L.

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users