Excellent points Ken; thanks! I expanded the FAQ entry here to include these points:
http://www.open-mpi.org/faq/?category=openfabrics#ofa-fork

On Nov 30, 2010, at 9:52 AM, Ken Cain wrote:

> Hi Jeff,
>
> We have had some recent experience with this in an Open MPI 1.4.x version and
> thought it would be useful to contribute to the discussion. Please see below.
>
> Jeff Squyres wrote:
>> On Nov 29, 2010, at 6:25 PM, George Bosilca wrote:
>>> The main problem is that openib requires pinning memory pages in order to
>>> take advantage of RMA features. There is a major issue with these pinned
>>> pages and fork, leading to segmentation faults in some specific cases.
>>> However, we only pin the pages on the MPI calls related to data transfers.
>>> Therefore, if you call fork __before__ any other MPI data transfer function
>>> (but after MPI_Init as you use the process rank), your application should
>>> be safe.
>> Note that Open MPI also pins some internal memory during MPI_INIT, but that
>> memory is totally internal to libmpi, so you should be safe (i.e., you
>> should never be able to find it and therefore never be able to try to touch
>> it).
>
> This is what we believe happened in our testing:
>
> 1. MPI_Init allocated and pinned down some memory. This memory was 64-byte
> aligned and not page-aligned to 4096 bytes. So an allocation that ideally
> should have resulted in 2 pages being pinned actually had 3 pages pinned,
> with lots of unused memory on the 3rd page.
>
> 2. A child process created via popen tried to allocate some memory (perhaps a
> byproduct of popen execution itself) and was allocated memory on that last
> page with lots of unused memory. When the child tried to touch the
> allocation, there was a seg fault.
>
> We could reduce the probability of this happening by changing the alignment
> of MPI allocations to 4096 bytes. But since MPI allocations are not sized to
> be a multiple of the page size, this isn't a foolproof method.
>
> One way (agreed, not ideal) to avoid the potential seg fault is to set the MCA
> parameter btl_openib_want_fork_support = 0. But then you are "trusting" any
> child processes not to touch, intentionally or as a result of a bug, the
> memory regions that have been registered/pinned by the parent.
>
>>>> How can one be sure that disabling the warning is ok? Could you please
>>>> elaborate on what makes forks vulnerable? Maybe that will guide the
>>>> developers to make an informed decision on whether to disable them or find
>>>> another alternative.
>>> No way to know at 100%. Now for an elaborate answer: Once upon a time ...
>>> The fork story is a long and boring one, we would all have preferred to
>>> never have heard about it (believe me). A quick and compressed version can
>>> be found on the QLogic download page
>>> (http://filedownloads.qlogic.com/files/driver/70277/release_QLogicIB-Basic_4400_Rev_A.html).
>> That's a good summary. The issue is with OFED itself, not with Open MPI.
>> Note, too, that calling popen() should also be safe (even though we'll warn
>> about it -- our atfork hook has no way of knowing whether you're calling
>> system, popen, or something else).
>
> Thanks,
>
> -Ken
> --
> Ken Cain
> Mercury Computer Systems, Inc. (http://www.mc.com)

--
Jeff Squyres
jsquy...@cisco.com