How do you support MPI dynamic processes over MXM?

  George.

On Mon, Mar 2, 2015 at 12:43 PM, Alexander Mikheev <al...@mellanox.com>
wrote:

> MXM needs that barrier. Otherwise some ranks may hang while trying to
> close MXM connections.
>
> > -----Original Message-----
> > From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Ralph
> > Castain
> > Sent: Monday, March 02, 2015 5:05 PM
> > To: de...@open-mpi.org
> > Subject: Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch
> > master updated. dev-1195-gfbb7c80
> >
> > It’s your code, so you are welcome to do this if you want. I’ll just
> > point out that this is a really big hit in total execution time at scale,
> > as it will be done in addition to the barrier already performed in
> > MPI_Finalize.
> >
> > So you are going to do _two_ barriers during shutdown.
> >
> >
> > > On Mar 2, 2015, at 5:43 AM, git...@crest.iu.edu wrote:
> > >
> > > This is an automated email from the git hooks/post-receive script. It
> > > was generated because a ref change was pushed to the repository
> > > containing the project "open-mpi/ompi".
> > >
> > > The branch, master has been updated
> > >       via  fbb7c80312cbcd823346e89a56f5d83e8620c57c (commit)
> > >       via  168c83ed9592120fd2199e8280b517ab0060e136 (commit)
> > >      from  42f5a36ee3f1e400aa251804725b86192c9df9fa (commit)
> > >
> > > Those revisions listed above that are new to this repository have not
> > > appeared on any other notification email; so we list those revisions
> > > in full, below.
> > >
> > > - Log
> > > -----------------------------------------------------------------
> > > https://github.com/open-mpi/ompi/commit/fbb7c80312cbcd823346e89a56f5d83e8620c57c
> > >
> > > commit fbb7c80312cbcd823346e89a56f5d83e8620c57c
> > > Merge: 42f5a36 168c83e
> > > Author: Mike Dubman <mi...@mellanox.com>
> > > Date:   Mon Mar 2 15:43:32 2015 +0200
> > >
> > >    Merge pull request #439 from alex-mikheev/topic/mxm_finalize_fix
> > >
> > >    OMPI/MXM: add out of band barrier at the end of del_procs
> > >
> > >
> > >
> > > https://github.com/open-mpi/ompi/commit/168c83ed9592120fd2199e8280b517ab0060e136
> > >
> > > commit 168c83ed9592120fd2199e8280b517ab0060e136
> > > Author: Alex Mikheev <al...@mellanox.com>
> > > Date:   Mon Mar 2 12:56:02 2015 +0200
> > >
> > >    OMPI/MXM: add out of band barrier at the end of del_procs
> > >
> > >    mxm shutdown requires out of band barrier
> > >
> > > diff --git a/ompi/mca/mtl/mxm/mtl_mxm.c b/ompi/mca/mtl/mxm/mtl_mxm.c
> > > index 1a4e21a..ed4089a 100644
> > > --- a/ompi/mca/mtl/mxm/mtl_mxm.c
> > > +++ b/ompi/mca/mtl/mxm/mtl_mxm.c
> > > @@ -617,6 +617,7 @@ int ompi_mtl_mxm_del_procs(struct mca_mtl_base_module_t *mtl, size_t nprocs,
> > >             OBJ_RELEASE(endpoint);
> > >         }
> > >     }
> > > +    opal_pmix.fence(NULL, 0);
> > >     return OMPI_SUCCESS;
> > > }
> > >
> > > diff --git a/ompi/mca/pml/yalla/pml_yalla.c b/ompi/mca/pml/yalla/pml_yalla.c
> > > index 2cfa6ca..d53cb7c 100644
> > > --- a/ompi/mca/pml/yalla/pml_yalla.c
> > > +++ b/ompi/mca/pml/yalla/pml_yalla.c
> > > @@ -240,6 +240,7 @@ int mca_pml_yalla_del_procs(struct ompi_proc_t **procs, size_t nprocs)
> > >         PML_YALLA_VERBOSE(2, "disconnected from rank %ld", procs[i]->super.proc_name);
> > >         procs[i]->proc_endpoints[OMPI_PROC_ENDPOINT_TAG_PML] = NULL;
> > >     }
> > > +    opal_pmix.fence(NULL, 0);
> > >     return OMPI_SUCCESS;
> > > }
> > >
> > >
> > >
> > > -----------------------------------------------------------------------
> > >
> > > Summary of changes:
> > > ompi/mca/mtl/mxm/mtl_mxm.c     | 1 +
> > > ompi/mca/pml/yalla/pml_yalla.c | 1 +
> > > 2 files changed, 2 insertions(+)
> > >
> > >
> > > hooks/post-receive
> > > --
> > > open-mpi/ompi
> > > _______________________________________________
> > > ompi-commits mailing list
> > > ompi-comm...@open-mpi.org
> > > http://www.open-mpi.org/mailman/listinfo.cgi/ompi-commits
> >
> > _______________________________________________
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Searchable archives: http://www.open-mpi.org/community/lists/devel/2015/03/index.php
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/03/17080.php
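
The ordering discussed in the quoted thread (every rank finishes disconnecting before any rank tears down) can be illustrated with a minimal sketch. This is not MXM or PMIx code: POSIX threads stand in for ranks, a pthread barrier stands in for the out-of-band fence, and every name (sim_rank, sim_shutdown, sim_alive, SIM_NRANKS) is hypothetical. The failure mode it counts, a peer already torn down while another rank is still disconnecting, is an assumption about why ranks could hang; the thread itself only says that they may.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

#define SIM_NRANKS 8                      /* hypothetical number of simulated ranks */

static atomic_int sim_alive[SIM_NRANKS];  /* 1 until a rank tears itself down */
static atomic_int sim_early_teardowns;    /* peers found dead during disconnect */
static pthread_barrier_t sim_fence;       /* stand-in for the out-of-band fence */

/* One simulated rank: disconnect from all peers, fence, then tear down. */
static void *sim_rank(void *arg)
{
    int me = (int)(intptr_t)arg;

    /* "del_procs" phase: each disconnect assumes the peer is still alive. */
    for (int peer = 0; peer < SIM_NRANKS; peer++) {
        if (peer != me && !atomic_load(&sim_alive[peer])) {
            atomic_fetch_add(&sim_early_teardowns, 1);
        }
    }

    /* Nobody proceeds to teardown until every rank has disconnected. */
    pthread_barrier_wait(&sim_fence);

    atomic_store(&sim_alive[me], 0);
    return NULL;
}

/* Runs one simulated shutdown; returns how many disconnects hit a dead peer. */
int sim_shutdown(void)
{
    pthread_t tid[SIM_NRANKS];
    int rc;

    atomic_store(&sim_early_teardowns, 0);
    for (int i = 0; i < SIM_NRANKS; i++) {
        atomic_store(&sim_alive[i], 1);
    }

    rc = pthread_barrier_init(&sim_fence, NULL, SIM_NRANKS);
    assert(rc == 0);

    for (int i = 0; i < SIM_NRANKS; i++) {
        rc = pthread_create(&tid[i], NULL, sim_rank, (void *)(intptr_t)i);
        assert(rc == 0);
    }
    for (int i = 0; i < SIM_NRANKS; i++) {
        rc = pthread_join(tid[i], NULL);
        assert(rc == 0);
    }

    rc = pthread_barrier_destroy(&sim_fence);
    assert(rc == 0);
    (void)rc;

    return atomic_load(&sim_early_teardowns);
}
```

Called from a main() and built with -pthread, sim_shutdown() should return 0, because no rank can clear its flag until all ranks have passed the barrier. Without the pthread_barrier_wait call a nonzero count becomes possible, though not guaranteed on any given run.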
