Hello,

So we now have Slurm 18.08.6-2 compiled with PMIx 3.1.2.

Then I installed Open MPI 4.0.0, configured with:

--with-slurm --with-pmix=internal --with-libevent=internal --enable-shared --enable-static --with-x


(Following the thread: it was mentioned that building OMPI 4.0.0 against an
external PMIx 3.1.2 fails with PMIX_MODEX and PMIX_INFO_ARRAY errors, so I
used the internal PMIx.)
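To confirm what actually went into the build, something like the following can be used (a diagnostic sketch; the exact component names printed vary between Open MPI versions and builds):

```shell
# Show the PMIx support compiled into this Open MPI installation.
# With --with-pmix=internal you would expect the bundled component(s)
# rather than a reference to an external install prefix.
ompi_info | grep -i pmix

# Also check that Slurm launch support was built in:
ompi_info --parsable | grep -i slurm
```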



The MPI program fails with:


*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[cn603-13-r:387088] Local abort before MPI_INIT completed completed 
successfully, but am not able to aggregate error messages, and not able to 
guarantee that all other processes were killed!


for each process. Please advise: what is going wrong here?
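One thing worth checking (a diagnostic sketch; the plugin names shown depend on how Slurm was built, and ./mpi_hello is a placeholder for any MPI binary) is whether Slurm's PMIx plugin is actually available to srun, since this error pattern often means the processes were launched without matching PMI wire-up:

```shell
# List the MPI/PMI plugins this Slurm installation knows about.
# With Slurm built against PMIx 3.x you would expect an entry such as
# "pmix" or "pmix_v3" (the exact name varies by build).
srun --mpi=list

# Then launch explicitly with the PMIx plugin, e.g.:
srun --mpi=pmix -n 4 ./mpi_hello
```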

All the best,
--
Passant A. Hafez | HPC Applications Specialist
KAUST Supercomputing Core Laboratory (KSL)
King Abdullah University of Science and Technology
Building 1, Al-Khawarizmi, Room 0123
Mobile : +966 (0) 55-247-9568
Mobile : +20 (0) 106-146-9644
Office  : +966 (0) 12-808-0367
________________________________
From: users <users-boun...@lists.open-mpi.org> on behalf of Ralph H Castain 
<r...@open-mpi.org>
Sent: Monday, March 4, 2019 5:29 PM
To: Open MPI Users
Subject: Re: [OMPI users] Building PMIx and Slurm support



On Mar 4, 2019, at 5:34 AM, Daniel Letai <d...@letai.org.il> wrote:

Gilles,
On 3/4/19 8:28 AM, Gilles Gouaillardet wrote:
Daniel,


On 3/4/2019 3:18 PM, Daniel Letai wrote:

So unless you have a specific reason not to mix both, you might also give the 
internal PMIx a try.
Does this hold true for libevent too? Configure complains if the libevent used
for Open MPI differs from the one used by the other tools.


I am not exactly sure which scenario you are running.

Long story short:

- If you use an external PMIx, then you have to use an external libevent
(otherwise configure will fail). It must be the same one used by PMIx, but I
am not sure configure checks that.

- If you use the internal PMIx, then it is up to you: you can use either the
internal libevent or an external one.
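The two consistent combinations can be sketched as configure invocations (the install prefixes here are placeholders, not paths from this thread):

```shell
# Option 1: external PMIx, which requires the matching external libevent.
# /opt/pmix and /opt/libevent are placeholder prefixes.
./configure --with-slurm \
            --with-pmix=/opt/pmix \
            --with-libevent=/opt/libevent

# Option 2: internal PMIx; libevent may then be internal or external.
./configure --with-slurm \
            --with-pmix=internal \
            --with-libevent=internal
```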

Thanks, that clarifies the issues I've experienced. Since PMIx doesn't have to
be the same for the server and the nodes, I can compile Slurm against an
external PMIx with the system libevent, and compile Open MPI with the internal
PMIx and libevent, and that should work. Is that correct?

Yes - that is indeed correct!


BTW, building 4.0.1rc1 completed successfully using external for everything;
I will start testing in the near future.

Cheers,


Gilles

Thanks,
Dani_L.
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users