To those who care about the openib BTL...

SHORT VERSION
-------------

Do you really want to call ibv_fork_init() in the openib BTL by default?

MORE DETAIL
-----------

Rolf V. pointed out to me yesterday that we're calling ibv_fork_init() in the 
openib BTL.  He asked if we did the same in the usnic BTL.  We don't, and 
here's why:

1. It adds a slight performance penalty to ibv_reg_mr()/ibv_dereg_mr().
2. The only thing ibv_fork_init() protects against is the child sending from 
memory that it thinks should already be registered:

-----
MPI_Init(...);
if (0 == fork()) {
    ibv_post_send(some_previously_pinned_buffer, ...);
    // ^^ this can't work because the buffer is *not* pinned in the child
    // (for lack of a longer explanation here)
}
-----

3. ibv_fork_init() is not intended to protect against a child invoking an MPI 
function (if they do that, they get what they deserve!).

Note that #2 can't happen, because MPI doesn't expose its protection domains, 
queue pairs, or registrations (or any of its verbs constructs) at all.  

Hence, all ibv_fork_init() does is a) impose a performance penalty, and b) make 
registered memory unavailable (not mapped at all) in a child process, such that:

-----
ibv_fork_init();
a = malloc(...);
a[0] = 17;
ibv_reg_mr(a, ...);
if (0 == fork()) {
    printf("this is a[0]: %d\n", a[0]);
    // ^^ This will segv
}
-----
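
For anyone who wants to see effect (b) without an HCA, here is a minimal, 
Linux-only sketch.  It assumes that ibv_fork_init() gets this behavior by having 
libibverbs mark registered pages with madvise(MADV_DONTFORK); the sketch calls 
madvise() directly on an anonymous page as a stand-in for ibv_reg_mr().  
child_fault_signal() is a hypothetical helper made up for this illustration; it 
is not part of verbs or Open MPI.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS and MADV_DONTFORK on glibc */
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only).
 * Writes 17 into a fresh page, optionally marks the page MADV_DONTFORK
 * (ASSUMPTION: this mimics what registration does after ibv_fork_init()),
 * then forks and lets the child read the value back.
 *
 * Returns the signal that killed the child, 0 if the child read the value
 * successfully, or -1 on a setup error or unexpected child exit.
 */
int child_fault_signal(int use_dontfork)
{
    assert(use_dontfork == 0 || use_dontfork == 1);

    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0) {
        return -1;
    }
    size_t len = (size_t)page_size;

    /* A page-aligned anonymous mapping stands in for a registered buffer. */
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        return -1;
    }
    volatile int *a = (volatile int *)mem;
    a[0] = 17;

    if (use_dontfork && madvise(mem, len, MADV_DONTFORK) != 0) {
        munmap(mem, len);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(mem, len);
        return -1;
    }
    if (pid == 0) {
        /* Child: with MADV_DONTFORK the range was not inherited,
         * so this read touches an unmapped address. */
        _exit(a[0] == 17 ? 0 : 1);
    }

    /* Parent: report how the child ended. */
    int status = 0;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid) {
        if (WIFSIGNALED(status)) {
            result = WTERMSIG(status);
        } else if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
            result = 0;
        }
    }
    munmap(mem, len);
    return result;
}
```

On Linux, child_fault_signal(1) should return SIGSEGV, while 
child_fault_signal(0) should return 0: without the madvise() call the child 
simply reads 17 through ordinary copy-on-write.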

But the registered memory may actually be useful in the child.

So I just thought I'd pass this along, and ask the openib-caring people of the 
world if you really still want to be calling ibv_fork_init() by default in the 
openib BTL.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
