It sounds to me like the issue described here is similar to
https://github.com/STEllAR-GROUP/hpx/pull/3948.
Something changed in how we retrieve the number of localities (the issue on
alps was discovered with 1.3.0).

Hartmut Kaiser <[email protected]> wrote on Fri, July 5, 2019 at 18:51:

> Andy,
>
> Could you please list your MPI env variables here? The number of localities
> should be picked up from them before the parcelport is initialized.
> Something goes wrong there, so it assumes that networking should not be
> enabled at all...
>
> Thanks!
> Regards Hartmut
> ---------------
> http://stellar.cct.lsu.edu
> https://github.com/STEllAR-GROUP/hpx
>
>
> > -----Original Message-----
> > From: [email protected] <hpx-users-
> > [email protected]> On Behalf Of Andreas Schäfer
> > Sent: Friday, July 5, 2019 5:14 AM
> > To: [email protected]
> > Subject: Re: [hpx-users] Debugging bad locality number on MPI?
> >
> > Thanks Thomas!
> >
> > The MPI environment variables are set up correctly. I dug a bit deeper
> > and figured out that HPX isn't even enabling networking. In
> > hpx/src/util/runtime_configuration.cpp the function
> > enable_networking() only enables networking if one of the following
> > conditions is true:
> >
> > a) the number of localities is > 1
> > b) the node number (is this the rank?) is > 0
> > c) the number of expected localities is != 0
> > d) the runtime mode is not console
> >
> > In my case the values are
> > a) 1
> > b) -1
> > c) 0
> > d) console
> >
> > These seem to be correct, since HPX can't know the number of localities
> > prior to initializing the MPI parcelport.
> > (enable_networking() is run when the parcelhandler is created, but before
> > the parcelports are instantiated.)
> >
> > I'll be honest: I don't quite understand the logic there, but if I change
> > the code to enable networking in console mode, or to enable networking
> > when the node number equals -1, then it works.
> >
> > Looks like this was introduced with
> > https://github.com/STEllAR-GROUP/hpx/commit/ffb8470e6a1143e9e1c95c39eff58eec322148d3#diff-86850321552971332a5de979a34bd259
> >
> > Thoughts?
> >
> > Thanks!
> > -Andi
> >
> >
> > > On 10:45 Fri 05 Jul, Thomas Heller wrote:
> > > The relevant parts in the codebase are here:
> > > Setting the env variables to check via CMake:
> > > https://github.com/STEllAR-GROUP/hpx/blob/master/plugins/parcelport/mpi/CMakeLists.txt#L42
> > > Detecting whether we run inside an MPI environment:
> > > https://github.com/STEllAR-GROUP/hpx/blob/master/plugins/parcelport/mpi/mpi_environment.cpp#L30-L52
> > >
> > > On Fri, Jul 5, 2019 at 10:41 AM Thomas Heller <[email protected]> wrote:
> > >
> > > > The initialization of the MPI parcelport is done by checking
> > > > environment variables set by the most common MPI implementations.
> > > > Which MPI implementation do you use? Could you maybe attach the
> > > > output of `mpirun env`?
> > > >
> > > > Andreas Schäfer <[email protected]> wrote on Fri, July 5, 2019 at 08:58:
> > > >
> > > >>
> > > >> On 01:11 Fri 05 Jul, Marcin Copik wrote:
> > > >> > I've seen this kind of issue in pure MPI applications. The usual
> > > >> > reason is using an mpirun/mpiexec provided by an implementation
> > > >> > different from the one used for linking. Checking for a mismatch
> > > >> > there might help.
> > > >>
> > > >> There is only one MPI version installed on that machine. Also,
> > > >> running a simple MPI hello world works as expected. My assumption
> > > >> is that the MPI parcelport is not initialized correctly. Which part
> > > >> of the code loads/initializes all the parcelports?
> > > >>
> > > >> Thanks
> > > >> -Andi
> > > >>
> > > >>
> > > >> --
> > > >> ==========================================================
> > > >> Andreas Schäfer
> > > >>
> > > >> HPC and Supercomputing for Computer Simulations LibGeoDecomp
> > > >> project lead, http://www.libgeodecomp.org
> > > >>
> > > >> PGP/GPG key via keyserver
> > > >>
> > > >> I'm an SRE @ Google, this is a private account though.
> > > >> All mails are my own and not Google's.
> > > >> ==========================================================
> > > >>
> > > >> (\___/)
> > > >> (+'.'+)
> > > >> (")_(")
> > > >> This is Bunny. Copy and paste Bunny into your signature to help him
> > > >> gain world domination!
> > > >> _______________________________________________
> > > >> hpx-users mailing list
> > > >> [email protected]
> > > >> https://mail.cct.lsu.edu/mailman/listinfo/hpx-users
> > > >>
> > > >
> >
> >
> >
>
>
