To those who care about the openib BTL...
SHORT VERSION
-------------
Do you really want to call ibv_fork_init() in the openib BTL by default?
MORE DETAIL
-----------
Rolf V. pointed out to me yesterday that we're calling ibv_fork_init() in the
openib BTL. He asked if we did the same in the usnic BTL. We don't, and
here's why:
1. It adds a slight performance penalty to every ibv_reg_mr()/ibv_dereg_mr()
call.
2. The only thing ibv_fork_init() protects against is the child sending from
memory that it thinks should already be registered:
-----
MPI_Init(...);
if (0 == fork()) {
    ibv_post_send(some_previously_pinned_buffer, ...);
    // ^^ this can't work because the buffer is *not* pinned in the child
    //    (for lack of a longer explanation here)
}
-----
3. ibv_fork_init() is not intended to protect against a child invoking an MPI
function (if they do that, they get what they deserve!).
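To put a rough number on item 1: as far as I understand it, once
ibv_fork_init() has been called, libibverbs marks each registered range with
madvise(MADV_DONTFORK) at registration time and undoes that with
madvise(MADV_DOFORK) at deregistration. The sketch below times only that pair
of system calls on a single anonymous page. It is a proxy, not a measurement of
ibv_reg_mr() itself: it is Linux-only, it uses no verbs calls at all, it ignores
whatever bookkeeping libibverbs does around the madvise() calls, and the
function name madvise_pair_cost_ns() is made up for this example.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: average cost in nanoseconds of one
 * MADV_DONTFORK + MADV_DOFORK pair on a single page.
 * Returns -1.0 if the arguments or any system call fail. */
double madvise_pair_cost_ns(int iterations)
{
    if (iterations <= 0) {
        return -1.0;
    }

    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0) {
        return -1.0;
    }

    /* mmap() returns page-aligned memory, which madvise() requires. */
    void *buf = mmap(NULL, (size_t) page_size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (MAP_FAILED == buf) {
        return -1.0;
    }

    struct timespec start, end;
    if (0 != clock_gettime(CLOCK_MONOTONIC, &start)) {
        munmap(buf, (size_t) page_size);
        return -1.0;
    }

    for (int i = 0; i < iterations; ++i) {
        /* Stand-in for the extra work at register / deregister time. */
        if (0 != madvise(buf, (size_t) page_size, MADV_DONTFORK) ||
            0 != madvise(buf, (size_t) page_size, MADV_DOFORK)) {
            munmap(buf, (size_t) page_size);
            return -1.0;
        }
    }

    if (0 != clock_gettime(CLOCK_MONOTONIC, &end)) {
        munmap(buf, (size_t) page_size);
        return -1.0;
    }
    munmap(buf, (size_t) page_size);

    double elapsed_ns = (double) (end.tv_sec - start.tv_sec) * 1e9 +
                        (double) (end.tv_nsec - start.tv_nsec);
    return elapsed_ns / iterations;
}
```

Calling madvise_pair_cost_ns(100000) from a small main() and printing the
result gives a per-pair figure for the machine at hand. Compare it against
your own registration timings before drawing any conclusions.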
Note that #2 can't happen, because MPI doesn't expose its protection domains,
queue pairs, or registrations (or any of its verbs constructs) at all.
Hence, in an MPI application, all ibv_fork_init() does is a) impose a
performance penalty, and b) make registered memory physically unavailable in a
child process, such that:
-----
ibv_fork_init();
a = malloc(...);
a[0] = 17;
ibv_reg_mr(pd, a, ...);
if (0 == fork()) {
    printf("this is a[0]: %d\n", a[0]);
    // ^^ This will segv
}
-----
But the registered memory may actually be useful in the child.
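For anyone who wants to see this behavior without verbs hardware, here is a
minimal Linux-only sketch. It assumes that the effect in the child comes from
libibverbs applying madvise(MADV_DONTFORK) to registered pages, and it calls
madvise() directly as a stand-in for ibv_fork_init() + ibv_reg_mr(). The
function name child_signal_after_dontfork() is made up for this example, and
mmap() replaces malloc() only because madvise() needs a page-aligned address.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the signal that killed the child,
 * 0 if the child exited on its own, or -1 if setup failed. */
int child_signal_after_dontfork(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0) {
        return -1;
    }

    int *a = mmap(NULL, (size_t) page_size, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (MAP_FAILED == a) {
        return -1;
    }
    a[0] = 17;

    /* Stand-in for registration after ibv_fork_init(): the page will
     * not be mapped into any child created by fork(). */
    if (0 != madvise(a, (size_t) page_size, MADV_DONTFORK)) {
        munmap(a, (size_t) page_size);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(a, (size_t) page_size);
        return -1;
    }
    if (0 == pid) {
        /* Child: the address is unmapped here, so this read faults. */
        volatile int *p = a;
        _exit(17 == p[0] ? 0 : 1);
    }

    /* Parent: the page is still mapped and a[0] is still 17. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        munmap(a, (size_t) page_size);
        return -1;
    }
    munmap(a, (size_t) page_size);

    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux the return value should be SIGSEGV: the child dies on the read while
the parent's copy of the page is untouched.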
So I just thought I'd pass this along, and ask the openib-caring people of the
world if you really still want to be calling ibv_fork_init() by default in the
openib BTL.
--
Jeff Squyres
[email protected]