Thanks Andrew, I was getting confused by the libfabric psm provider code
inside Open MPI.

2015-03-03 9:35 GMT-07:00 Friedley, Andrew <andrew.fried...@intel.com>:

>  Hi Howard,
>
>
>
> The PSM MTL sets PSM_EP_OPEN_AFFINITY_SKIP, so if I understand right, OMPI
> already has the fix for you.
>
>
>
> Andrew
>
>
>
> *From:* devel [mailto:devel-boun...@open-mpi.org] *On Behalf Of *Howard
> Pritchard
> *Sent:* Tuesday, March 3, 2015 8:21 AM
> *To:* Open MPI Developers List
> *Subject:* [OMPI devel] psm and process affinity in open mpi
>
>
>
> Hi Folks,
>
>
>
> First, a disclaimer: I've looked through the Open MPI FAQ and have so far
> been unable to find an answer to my question below.
>
>
>
> I've been having a discussion with one of the other trilab folks about
> some issues with using PSM within mvapich, where the default cpu affinity
> behavior of PSM can cause problems. It turns out that the default behavior
> of PSM appears to be to set cpu affinity for a process which calls
> psm_ep_open if process affinity has not already been set. We're finding
> that it is necessary to use the PSM_EP_OPEN_AFFINITY_SKIP setting in the
> affinity field of the psm_opts struct that is passed to psm_ep_open in
> order to work around the problem.
>
>
>
> The problem has to do with singleton processes.  If mvapich is using psm
> and multiple singleton jobs are scheduled on a node, they all by default
> end up binding to core 0. Setting the above option eliminates this problem.
>
>
>
> Could Open MPI also potentially have this same problem?  If so, I'd want
> to add an mca param to set this option before calling psm_ep_open within
> psm mtl.  Hmm.. maybe the ofi mtl supporter should talk with the libfabric
> psm provider folks about this.
>
>
>
> Thanks for any help,
>
>
>
> Howard
>
>
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/03/17088.php
>
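
One way to check for the binding behavior described in the quoted message
is to query the affinity mask from inside the process. The following is a
minimal sketch, assuming Linux and glibc. It does not call PSM at all, and
the helper names (allowed_cpu_count, first_allowed_cpu, bind_to_cpu) are
made up for illustration. bind_to_cpu only mimics what a library that sets
affinity on the caller's behalf might do.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Hypothetical helper: number of CPUs the calling process may run on.
 * Returns -1 if the affinity mask cannot be read. */
int allowed_cpu_count(void)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        return -1;
    return CPU_COUNT(&mask);
}

/* Hypothetical helper: lowest-numbered CPU in the current affinity mask,
 * or -1 on error. */
int first_allowed_cpu(void)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &mask))
            return cpu;
    }
    return -1;
}

/* Hypothetical helper: pin the calling process to one CPU, standing in
 * for a library that binds the process itself. Returns 0 on success. */
int bind_to_cpu(int cpu)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);
    return sched_setaffinity(0, sizeof(mask), &mask);
}
```

Calling allowed_cpu_count() before and after the endpoint is opened would
show whether the mask shrank. Several singleton jobs on one node all
reporting the same single core from first_allowed_cpu() would match the
symptom described above.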
