Hi Jeff,

We have had some recent experience with this in an Open MPI 1.4.x version and thought it would be useful to contribute to the discussion. Please see below.

Jeff Squyres wrote:
On Nov 29, 2010, at 6:25 PM, George Bosilca wrote:

The main problem is that openib requires pinning memory pages in order to take 
advantage of RMA features. There is a major issue with these pinned pages and 
fork, leading to segmentation faults in some specific cases. However, we only 
pin the pages on the MPI calls related to data transfers. Therefore, if you 
call fork __before__ any other MPI data transfer function (but after MPI_Init 
as you use the process rank), your application should be safe.

Note that Open MPI also pins some internal memory during MPI_INIT, but that 
memory is totally internal to libmpi, so you should be safe (i.e., you should 
never be able to find it and therefore never be able to try to touch it).

This is what we believe happened in our testing:

1. MPI_Init allocated and pinned some memory. This memory was 64-byte aligned rather than page-aligned (4096 bytes), so an allocation that ideally should have pinned 2 pages actually pinned 3, leaving a lot of unused memory on the 3rd page.

2. A child process created via popen allocated some memory (perhaps as a byproduct of the popen execution itself) and was given memory on that last, mostly unused page. When the child touched the allocation, it hit a segmentation fault.

We could reduce the probability of this happening by changing the alignment of MPI allocations to 4096 bytes. But since MPI allocations are not sized to be a multiple of the page size, this isn't a foolproof method.
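
To make the page arithmetic above concrete, here is a minimal C sketch. The function names (pages_spanned, tail_slack) and the addresses in the note below are hypothetical and chosen only for illustration; they are not taken from Open MPI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of pages touched by the byte range [addr, addr + len).
 * Hypothetical helper for illustration; not part of Open MPI. */
size_t pages_spanned(uintptr_t addr, size_t len, size_t page_size)
{
    assert(page_size > 0 && len > 0);
    uintptr_t first = addr / page_size;
    uintptr_t last = (addr + len - 1) / page_size;
    return (size_t)(last - first + 1);
}

/* Bytes on the last touched page that lie past the end of the range.
 * This is the "unused memory" that another allocation could land on. */
size_t tail_slack(uintptr_t addr, size_t len, size_t page_size)
{
    assert(page_size > 0 && len > 0);
    size_t used_in_last = (size_t)((addr + len - 1) % page_size) + 1;
    return page_size - used_in_last;
}
```

With 4096-byte pages, an 8192-byte range starting at the 64-byte-aligned address 0x10040 gives pages_spanned == 3 and tail_slack == 4032, while the same range starting at the page-aligned address 0x10000 gives pages_spanned == 2 and tail_slack == 0. A page-aligned range of 8292 bytes still gives pages_spanned == 3 with tail_slack == 3996, which is why page alignment alone does not close the gap.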

One way (admittedly not ideal) to avoid the potential seg fault is to set the MCA parameter btl_openib_want_fork_support = 0. But then you are "trusting" any child process not to touch, intentionally or as a result of a bug, the memory regions that have been registered/pinned by the parent.
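
For context on why the child faults when fork support is enabled: our understanding (an assumption on our part, not something confirmed in this thread) is that the verbs fork support marks registered pages with madvise(MADV_DONTFORK), so those pages are simply absent from the child's address space. The Linux-only sketch below reproduces that effect with no InfiniBand hardware involved; child_signal_on_touch is a hypothetical helper name.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Maps one page, optionally marks it MADV_DONTFORK, forks, and lets the
 * child write to the page. Returns the signal that killed the child,
 * 0 if the child exited normally, or -1 on a setup error.
 * Hypothetical helper for illustration only. */
int child_signal_on_touch(int mark_dontfork)
{
    long page_l = sysconf(_SC_PAGESIZE);
    assert(page_l > 0);
    size_t page = (size_t)page_l;

    char *buf = mmap(NULL, page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;
    buf[0] = 1; /* touch the page in the parent */

    if (mark_dontfork && madvise(buf, page, MADV_DONTFORK) != 0) {
        munmap(buf, page);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(buf, page);
        return -1;
    }
    if (pid == 0) {
        volatile char *p = buf;
        p[0] = 2; /* faults if the page was not inherited */
        _exit(0);
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    munmap(buf, page);
    if (waited < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling child_signal_on_touch(0) should show the child exiting normally, while child_signal_on_touch(1) should report SIGSEGV, which matches the failure we saw when the popen child was handed memory on the partially used pinned page.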


How can one be sure that disabling the warning is ok? Could you please 
elaborate on what makes forks vulnerable? Maybe that will guide the developers 
to make an informed decision on whether to disable them or find another 
alternative.
No way to know at 100%. Now for an elaborate answer: Once upon a time ... The 
fork story is a long and boring one; we would all have preferred to never have heard 
about it (believe me). A quick and compressed version can be found on the 
QLogic download page 
(http://filedownloads.qlogic.com/files/driver/70277/release_QLogicIB-Basic_4400_Rev_A.html).

That's a good summary.  The issue is with OFED itself, not with Open MPI.

Note, too, that calling popen() should also be safe (even though we'll warn 
about it -- our atfork hook has no way of knowing whether you're calling 
system, popen, or something else).


Thanks,

-Ken
--
Ken Cain
Mercury Computer Systems, Inc. (http://www.mc.com)

This message is intended only for the designated recipient(s) and may
contain confidential or proprietary information of Mercury Computer
Systems, Inc. This message is solely intended to facilitate business
discussions and does not constitute an express or implied offer to sell
or purchase any products, services, or support. Any commitments must be
made in writing and signed by duly authorized representatives of each
party.
