Hi Folks,

I think this is a bug in the PSM MTL add_procs.  The call to psm_ep_connect
needs to take previously connected endpoints into account,
much like what is done in the libfabric psm provider code.
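
To make the idea concrete, here is a minimal sketch of an add_procs-style
routine that skips procs connected by an earlier call.  It does not use the
real PSM API: sketch_proc_t, sketch_ep_connect, sketch_add_procs and
connect_attempts are hypothetical stand-ins, and the per-proc "connected"
flag is an assumption about how the state might be tracked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from the PSM MTL source. */
#define MAX_PROCS 64

typedef struct {
    int epid;       /* stand-in for a PSM endpoint ID */
    int connected;  /* nonzero once a connection has been established */
} sketch_proc_t;

int connect_attempts = 0;  /* number of endpoints the stub has connected */

/* Stub playing the role of the connect call: it connects only the
 * entries whose mask value is nonzero. */
int sketch_ep_connect(size_t n, sketch_proc_t **procs, const int *mask)
{
    for (size_t i = 0; i < n; i++) {
        if (mask[i]) {
            procs[i]->connected = 1;
            connect_attempts++;
        }
    }
    return 0;
}

/* add_procs-style entry point that tolerates being handed a proc that a
 * previous call already connected. */
int sketch_add_procs(size_t n, sketch_proc_t **procs)
{
    int mask[MAX_PROCS];
    size_t pending = 0;

    assert(n <= MAX_PROCS);

    /* Mark only the procs that still need a connection. */
    for (size_t i = 0; i < n; i++) {
        mask[i] = !procs[i]->connected;
        pending += (size_t)mask[i];
    }

    /* Everything was connected by an earlier call: nothing to do. */
    if (pending == 0) {
        return 0;
    }
    return sketch_ep_connect(n, procs, mask);
}
```

With this shape, a second call that repeats an already connected proc only
connects the new ones, which is the behavior the second add_procs call
during MPI_Intercomm_create would seem to need.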

Howard


2014-11-12 3:12 GMT-07:00 Rainer Keller <rainer.kel...@hft-stuttgart.de>:

> Dear Andrew,
> no, this is not done with dynamically connecting jobs.
>
> The failing tests use a communicator that is set up by merging back an
> intercommunicator (MPI_Intercomm_merge), which was first split from
> MPI_COMM_WORLD (MPI_Intercomm_create).
>
> Please see tst_comm.c:459
>
> Best regards,
> Rainer
>
>
>
>
> On 11.11.2014, at 23:44, "Friedley, Andrew" <andrew.fried...@intel.com>
> wrote:
>
> > Ralph,
> >
> > You're right that PSM wouldn't support dynamically connecting jobs.  I
> > don't think intercomm_create implies that though.  For example you could
> > split COMM_WORLD's group into two groups, then create an intercommunicator
> > across those two groups.  I'm guessing that's what this test is doing, I'd
> > have to go read the code to be sure though.
> >
> > I verified this test works over PSM and OMPI 1.6.5; it fails on 1.8.1
> > and 1.8.3.
> >
> > Andrew
> >
> >> -----Original Message-----
> >> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Ralph
> >> Castain
> >> Sent: Tuesday, November 11, 2014 2:23 PM
> >> To: Open MPI Developers
> >> Subject: Re: [OMPI devel] 1.8.3 and PSM errors
> >>
> >> I thought PSM didn’t support dynamic operations such as Intercomm_create
> >> - yes? The PSM security key wouldn’t match between the two jobs, and so
> >> there is no way for them to communicate.
> >>
> >> Which is why I thought PSM can’t be used for dynamic operations at all,
> >> including comm_spawn and connect/accept
> >>
> >>
> >>> On Nov 11, 2014, at 2:13 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
> >>>
> >>> On Nov 11, 2014, at 4:56 PM, Friedley, Andrew <andrew.fried...@intel.com> wrote:
> >>>
> >>>> OK, I'm able to reproduce this now, not sure why I couldn't before.  I took
> >>>> a look at the diff of the PSM MTL from 1.6.5 to 1.8.1, and nothing is standing
> >>>> out to me.
> >>>>
> >>>> Question more for the general group:  Did anything related to the
> >>>> behavior/usage of MTL add_procs() change in this time window?
> >>>
> >>> The time between the 1.6.x series and the 1.8.x series is measured in
> >>> terms of a year or two, so, ya, something might have changed...
> >>>
> >>>> More particularly, it looks like add_procs is being called a second time
> >>>> during MPI_Intercomm_create and being passed a process that is already
> >>>> connected (passed into the first add_procs call).  Is that right?  Should the
> >>>> MTL handle multiple add_procs calls with the same proc provided?
> >>>
> >>> I'm afraid I don't know much about the MTL interface.
> >>>
> >>> George / Nathan?
> >>>
> >>> --
> >>> Jeff Squyres
> >>> jsquy...@cisco.com
> >>> For corporate legal information go to:
> >> http://www.cisco.com/web/about/doing_business/legal/cri/
> >>>
> >>> _______________________________________________
> >>> devel mailing list
> >>> de...@open-mpi.org
> >>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >>> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/11/16294.php
> >>
> >> _______________________________________________
> >> devel mailing list
> >> de...@open-mpi.org
> >> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/11/16295.php
> > _______________________________________________
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/11/16296.php
>
> ---------------------------------------------------------------------
> Prof. Dr.-Ing. Rainer Keller
> Hochschule für Technik Stuttgart
> Fakultät für Vermessung, Informatik und Mathematik
> Schellingstr. 24, Raum 2/449
> 70174 Stuttgart
> T.: +49 (0)711 8926-2812
> F.: +49 (0)711 8926-2553
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/11/16299.php
>
