Hi folks,

Question about PMIx in the 2.x tree: on 1.x I was able to start N individual 
jobs, each through its own mpirun -np 1, and have them gradually join a single 
intercommunicator through MPI_Comm_accept, MPI_Comm_connect, 
MPI_Intercomm_create, and MPI_Intercomm_merge. The port that one of the 
processes listened on included its IP address, and the others would connect to 
that. I tried porting this code to the 2.x tree and found that the port is now 
just an integer. Reading through the changelogs and commit history, I found 
that PMIx replaced DPM starting with 2.x. From what I have read about PMIx and 
Open MPI, my understanding is that Open MPI ships with a PMIx server 
implementation, and that all processes in a job have to be connected to this 
PMIx server at startup. It looks like MPI_Comm_accept and MPI_Comm_connect now 
communicate through key/value pairs in the PMIx server.

If that is right, it is no longer possible to start jobs through multiple 
mpirun executions and then join them into a single intercommunicator at 
runtime. Is my understanding correct?

Thank you,

Pieter
_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/devel
