Oh, I see. No, we don't want to add a full modex if there isn't one already.
Now, if we restrict this to the intra-node case (we don't care which
socket/core a process on another node is bound to), is there any simple
way to do an intra-node-only modex?
On 04/26/2016 04:28 PM, Ralph Castain wrote:
On Apr 26, 2016, at 3:35 PM, Sylvain Jeaugey <sjeau...@nvidia.com> wrote:
Indeed, I implied that affinity was set before MPI_Init (usually even before
the process is launched).
And yes, that would require a modex ... but I thought there was one already and
maybe we could pack the affinity information inside the existing one.
If the BTLs et al. don’t require the modex, then we don’t perform it (e.g., when
launched by mpirun or via a PMIx-enabled RM). So when someone does as you
describe, then we would have to force the modex to exchange the info. Doable,
but results in a scaling penalty, and so definitely not something we want to do
by default.
On 04/26/2016 02:56 PM, Ralph Castain wrote:
Hmmm…you mean for procs on the same node? I’m not sure how you can do it
without introducing another data exchange, and that would require the app to
execute it since otherwise we have no idea when they set the affinity.
If we assume they set the affinity prior to calling MPI_Init, then we could do
it - but at the cost of forcing a modex. You can only detect your own affinity,
so to get the relative placement, you have to do an exchange if we can’t pass
it to you. Perhaps we could offer it as an option?
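To illustrate the "you can only detect your own affinity" point, here is a minimal Linux-only sketch. It reads the calling process's CPU mask with sched_getaffinity(), which reflects a binding applied before launch (for example by taskset or a cpuset). The helper name own_affinity is hypothetical and not part of Open MPI; hwloc's hwloc_get_cpubind() would be the more portable route. The mask says nothing about where peers are bound, which is why an exchange is still needed to get relative placement.

```c
#define _GNU_SOURCE          /* needed for sched_getaffinity() and CPU_* macros */
#include <assert.h>
#include <sched.h>

/* Hypothetical helper (not an Open MPI function): store the set of CPUs the
 * calling process is allowed to run on in `mask`.
 * Returns the number of allowed CPUs, or -1 if the query fails. */
static int own_affinity(cpu_set_t *mask)
{
    assert(mask != NULL);
    CPU_ZERO(mask);
    /* pid 0 means "the calling thread". */
    if (sched_getaffinity(0, sizeof(*mask), mask) != 0) {
        return -1;
    }
    return CPU_COUNT(mask);
}
```

If this is called at the start of MPI_Init, the result only describes the binding in effect at that moment; an application that rebinds later would not be seen.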
On Apr 26, 2016, at 2:27 PM, Sylvain Jeaugey <sjeau...@nvidia.com> wrote:
Within the BTL code (and surely elsewhere), we can use those convenient
OPAL_PROC_ON_LOCAL_{NODE, SOCKET, ...} macros to figure out where another
endpoint is located compared to us.
The problem is that they only work when ORTE defines the locality. NODE almost
always works, since ORTE always sets it. But for NUMA, SOCKET, or CORE to
work, we need to use Open MPI's binding/mapping capabilities. If the process
affinity was set with something else (custom scripts using taskset, cpusets,
...), they don't work.
How hard do you think it would be to detect the affinity and set those flags
using hwloc to figure out if we're on the same {SOCKET, CORE, ...}? Where
would it be simplest to do this?
Thanks.
Sylvain
_______________________________________________
devel mailing list
de...@open-mpi.org
Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post:
http://www.open-mpi.org/community/lists/devel/2016/04/18821.php