Hi,

I think I have hit a bug on the Scientific Linux desktop relating to the
OpenMPI installation. There appears to be a condition that prevents any
MPI application from completing. Below is perhaps the simplest MPI program
that can be written and still link against the MPI library:

x.F90

program main

  use mpi

  implicit none

  integer :: ierror

  call mpi_init(ierror)
  call mpi_finalize(ierror)

end program main

I've compiled it with mpif90 -o x x.F90. Compilation goes fine, but when I try
to execute the resulting 'x' binary, I receive the following error:

libibverbs: Fatal: couldn't read uverbs ABI version.
--------------------------------------------------------------------------
[0,0,0]: OpenIB on host localhost was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
librdmacm: couldn't read ABI version.
librdmacm: assuming: 4
libibverbs: Fatal: couldn't read uverbs ABI version.
CMA: unable to open /dev/infiniband/rdma_cm

The warnings themselves should not be a problem: they just complain that the
high-performance interconnect options the library was compiled with cannot be
found, and the system should fall back to lower-performance defaults.

strace gives a lot of output but shows the program is stalling with:

futex(0x26049c, FUTEX_WAIT, 2, NULL

It looks like some kind of deadlock is occurring, probably related to
the threads involved or to shared memory.
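For anyone trying to reproduce this: attaching a debugger to the stalled process should show which thread is parked on the futex (a sketch; the process name `x` comes from the example above):

```shell
# Attach gdb to the hung process and dump every thread's backtrace,
# then exit without resuming it interactively.
gdb -p "$(pgrep -x x)" -batch -ex 'thread apply all bt'
```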

I have built a personal copy of OpenMPI with the OpenIB support removed:

./configure --prefix=/opt/local/openmpi --without-openib

This works (once I worked out that it was linking against the old
libraries at runtime) and the correct output (nothing) is produced.
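For reference, pointing the runtime at the new build looked roughly like this (paths follow the --prefix above; the exact environment handling is my sketch rather than a verified recipe):

```shell
# Put the rebuilt OpenMPI first on the search paths so the wrapper
# compiler and the dynamic linker pick it up rather than the system copy.
export PATH=/opt/local/openmpi/bin:$PATH
export LD_LIBRARY_PATH=/opt/local/openmpi/lib:$LD_LIBRARY_PATH

# Confirm which libmpi the binary will actually load at runtime.
ldd ./x | grep -i mpi
```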

Regards,
Panos


--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
