(cross-post to 'users' and 'devel' mailing lists)

Dear Open MPI developers,
A long time ago, I reported an error in Open MPI:
http://www.open-mpi.org/community/lists/users/2012/02/18565.php

Well, in 1.6 the behaviour has changed: the test case no longer hangs forever and blocks an InfiniBand interface, but seems to run through, and now this error message is printed:
--------------------------------------------------------------------------
The OpenFabrics (openib) BTL failed to register memory in the driver.
Please check /var/log/messages or dmesg for driver specific failure
reason.
The failure occured here:

  Local host:    mlx4_0
  Device:        openib_reg_mr
  Function:      Cannot allocate memory()
  Errno says:

You may need to consult with your system administrator to get this
problem fixed.
--------------------------------------------------------------------------



Looking into the FAQ
http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
gives us no hint about what is wrong. The locked memory limit is set to unlimited:
--------------------------------------------------------------------------
pk224850@linuxbdc02:~[502]$ cat /etc/security/limits.conf | grep memlock
#        - memlock - max locked-in-memory address space (KB)
*               hard    memlock         unlimited
*               soft    memlock         unlimited
--------------------------------------------------------------------------
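Note that limits.conf only shows the configured values; the limit actually in effect inside a running process (for example one started through a batch system or a remote shell) can differ from it. A minimal sketch for printing the effective limit from within a process on the compute node, written in Python purely for illustration (the helper name describe_memlock_limit is made up and not part of the test case):

```python
import resource


def describe_memlock_limit():
    """Return the (soft, hard) locked-memory limits of this process as strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)

    def fmt(value):
        # RLIM_INFINITY means "unlimited"; any other value is a size in bytes.
        return "unlimited" if value == resource.RLIM_INFINITY else f"{value} bytes"

    return fmt(soft), fmt(hard)


if __name__ == "__main__":
    soft, hard = describe_memlock_limit()
    print(f"memlock soft limit: {soft}")
    print(f"memlock hard limit: {hard}")
```

If this prints anything other than "unlimited" when launched the same way as the MPI processes, the limits.conf settings are not reaching the job.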


Could it still be an Open MPI issue? Are you interested in reproducing this?

Best,
Paul Kapinos

P.S.: The same test with Intel MPI cannot run using DAPL, but runs fine over 'ofa' (= native verbs, as Open MPI uses them). So I believe the problem is rooted in the communication pattern of the program: it sends very LARGE messages to many or all of the other processes. (The program performs a matrix transposition of a distributed matrix.)
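To give a feeling for why such a pattern puts pressure on memory registration, here is a rough sketch of the message volume of a distributed transposition. It assumes a square n x n matrix of 8-byte elements distributed by row blocks over the processes, with n divisible by the process count; the actual layout and sizes of the test program may well differ, and the function name and the numbers below are purely illustrative.

```python
def transpose_message_sizes(n, nprocs, elem_bytes=8):
    """Estimate message sizes for transposing an n x n matrix.

    Assumption (not taken from the real test case): the matrix is
    distributed by contiguous row blocks and n is divisible by nprocs.
    Returns (bytes per message to one peer, bytes sent to all other peers).
    """
    if n % nprocs != 0:
        raise ValueError("n must be divisible by nprocs in this sketch")
    rows_per_proc = n // nprocs
    # Each process owns rows_per_proc x n elements; the part destined for
    # one peer is a rows_per_proc x rows_per_proc sub-block.
    block_bytes = rows_per_proc * rows_per_proc * elem_bytes
    # One such block goes to every other process.
    sent_bytes = block_bytes * (nprocs - 1)
    return block_bytes, sent_bytes


if __name__ == "__main__":
    # Illustrative numbers only: 65536 x 65536 doubles on 64 processes.
    per_peer, total = transpose_message_sizes(65536, 64)
    print(f"per-peer message: {per_peer} bytes, sent per process: {total} bytes")
```

With these example numbers each process sends an 8 MiB message to each of the 63 other processes, and if all of those buffers are in flight at the same time, that much memory has to be registered with the adapter at once.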

--
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, Center for Computing and Communication
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915
