On Tuesday 02 August 2011 19:29, Shamis, Pavel wrote:
> XRC domain is created by process that starts first. All the rest processes,
> that belong to the same mpi session and reside on the same node, join the
> domain.
> TGT QP is created by process that receive inbound connection first and it is
> not necessary the same process that created the domain. Even so we assume
> that both processes belong to the same domain, and belong to the same mpi
> session.
>

The only things that are important here are:

1. Before the TGT QP creator exits (de-allocating its domain), there is at
   least one other process active which has opened the same domain (so that
   the domain, and the TGT QP are not de-allocated when the creator exits,
   which would clobber the calculation).
   Note that this condition probably exists already in MPI -- if the creator
   had the only domain reference, then the domain would be de-allocated when
   the creator exited, and the calculation would not work anyway.

2. When the job is finished, all processes have de-allocated the XRC domain --
   so that the domain gets de-allocated and all its TGT QPs destroyed
   (i.e., the domain's lifetime is the job).

If these 2 conditions are met, there is absolutely no justification for
TGT QP reference counting. The domain reference count is good enough -- when
the domain reference count goes to zero, the domain is de-allocated and all
its TGT QPs destroyed.

Things only get complicated when the domain-allocator process allocates a
single domain and simply uses that single domain for all jobs (i.e., the
domain is never de-allocated for the lifetime of the allocating process, and
the allocating process is the server for all jobs).
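
To make the lifetime argument concrete, here is a toy model in plain C. It is
only a sketch and does not use the verbs API: struct toy_domain and the toy_*
helpers are hypothetical names invented for illustration, and the fixed
MAX_TGT_QPS table is an arbitrary assumption. It shows the one property that
matters here: TGT QPs carry no reference count of their own, and they go away
exactly when the last process closes the domain.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TGT_QPS 8   /* arbitrary size for this sketch */

/* Hypothetical stand-in for an XRC domain: one reference count
 * plus the TGT QPs that the domain owns. */
struct toy_domain {
    int allocated;                 /* 1 while the domain exists */
    int refs;                      /* processes that have it open */
    int tgt_qp_live[MAX_TGT_QPS];  /* 1 while that TGT QP exists */
};

/* First opener allocates the domain; later openers just join it. */
void toy_domain_open(struct toy_domain *d)
{
    if (!d->allocated) {
        for (size_t i = 0; i < MAX_TGT_QPS; i++)
            d->tgt_qp_live[i] = 0;
        d->refs = 0;
        d->allocated = 1;
    }
    d->refs++;
}

/* Any process that has the domain open may create a TGT QP in it.
 * Returns the slot number, or -1 if the table is full. */
int toy_create_tgt_qp(struct toy_domain *d)
{
    assert(d->allocated && d->refs > 0);
    for (int i = 0; i < MAX_TGT_QPS; i++) {
        if (!d->tgt_qp_live[i]) {
            d->tgt_qp_live[i] = 1;
            return i;
        }
    }
    return -1;
}

/* Dropping the last reference de-allocates the domain and
 * destroys every TGT QP in it; no per-QP refcount is needed. */
void toy_domain_close(struct toy_domain *d)
{
    assert(d->allocated && d->refs > 0);
    if (--d->refs == 0) {
        for (size_t i = 0; i < MAX_TGT_QPS; i++)
            d->tgt_qp_live[i] = 0;
        d->allocated = 0;
    }
}

/* Nonzero while the given TGT QP still exists. */
int toy_tgt_qp_alive(const struct toy_domain *d, int qpn)
{
    return d->allocated && qpn >= 0 && qpn < MAX_TGT_QPS &&
           d->tgt_qp_live[qpn];
}
```

In this model, if two processes open the domain and the one that created the
TGT QP closes first, the QP stays alive until the second process closes as
well (condition 1). Once every process has closed, the domain and all its
TGT QPs are gone (condition 2).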
