This is part of the challenge of HPC: there are general solutions, but no specific silver bullet that works in all scenarios. In short: everyone's setup is different, so we can offer advice, but not necessarily a 100%-guaranteed solution that will work in your environment.

In general, we advise users to:

* Configure/build Open MPI with their favorite compilers (either a proprietary compiler or a modern/recent GCC or Clang). More recent compilers tend to give better performance than older ones.
* Configure/build Open MPI against the communication library for your HPC interconnect. These days, that's mostly Libfabric or UCX; if you have InfiniBand, it's UCX.

That's probably table stakes right there; you can tweak more beyond that, but with those two things you'll go pretty far.
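For example, a configure line along these lines would cover both points. This is only a minimal sketch: the compiler choices, install prefix, and UCX path below are purely illustrative and need to be adapted to your site.

    # Minimal sketch -- adjust compilers, prefix, and the UCX path to your site.
    ./configure CC=gcc CXX=g++ FC=gfortran \
        --prefix=$HOME/opt/openmpi \
        --with-ucx=/opt/ucx \
        --with-slurm        # if your cluster uses Slurm
    make -j 8 && make install
    # (For Libfabric-based fabrics, --with-ofi=<path> is the analogous option.)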
FWIW, we did a series of talks about Open MPI in conjunction with the EasyBuild community recently. The videos and slides are available here:

* https://www.open-mpi.org/video/?category=general#abcs-of-open-mpi-part-1
* https://www.open-mpi.org/video/?category=general#abcs-of-open-mpi-part-2
* https://www.open-mpi.org/video/?category=general#abcs-of-open-mpi-part-3

For a beginner, parts 1 and 2 are probably the most relevant; you can probably skip the parts about PMIx and circle back to those later for more advanced knowledge.

--
Jeff Squyres
jsquy...@cisco.com

________________________________________
From: users <users-boun...@lists.open-mpi.org> on behalf of Diego Zuccato via users <users@lists.open-mpi.org>
Sent: Thursday, January 27, 2022 2:59 AM
To: users@lists.open-mpi.org
Cc: Diego Zuccato
Subject: Re: [OMPI users] RES: OpenMPI - Intel MPI

Sorry for the noob question, but: what should I configure for OpenMPI "to perform on the host cluster"? Any link to a guide would be welcome!

Slightly extended rationale for the question: I'm currently using "unconfigured" Debian packages and getting some strange behaviour... Maybe it's just something that a little tuning can fix easily.

On 27/01/2022 07:58, Ralph Castain via users wrote:
> I'll disagree a bit there. You do want to use an MPI library in your
> container that is configured to perform on the host cluster. However,
> that doesn't mean you are constrained as Gilles describes. It takes a
> little more setup knowledge, true, but there are lots of instructions
> and knowledgeable people out there to help. Experiments have shown that
> non-system MPIs, when properly configured, provide at least equivalent
> performance to the native MPIs. Matching the internal/external MPI
> implementations may simplify the mechanics of setting it up, but it is
> definitely not required.
>
>> On Jan 26, 2022, at 8:55 PM, Gilles Gouaillardet via users
>> <users@lists.open-mpi.org> wrote:
>>
>> Brian,
>>
>> FWIW
>>
>> Keep in mind that when running a container on a supercomputer, it is
>> generally recommended to use the supercomputer's MPI implementation
>> (fine-tuned and with support for the high-speed interconnect) instead
>> of the container's (generally a vanilla MPI with basic support for
>> TCP and shared memory).
>> That scenario implies several additional constraints, and one of them
>> is that the MPI libraries of the host and the container are
>> (to oversimplify) ABI compatible.
>>
>> In your case, you would have to rebuild your container with MPICH
>> (instead of Open MPI) so it can be "substituted" at run time with
>> Intel MPI (which is MPICH-based and ABI compatible).
>>
>> Cheers,
>>
>> Gilles
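A minimal sketch of the substitution Gilles describes, assuming Singularity/Apptainer and an application built against MPICH inside the image. Every path below is illustrative; the exact library layout depends on your Intel MPI installation.

    # The host's Intel MPI (same MPICH ABI: libmpi.so.12) is bind-mounted into
    # the container and put first on the library search path at run time.
    # Paths are illustrative only.
    export SINGULARITYENV_LD_LIBRARY_PATH=/opt/impi/lib/release:/opt/impi/lib
    mpirun -n 64 singularity exec \
        --bind /opt/intel/oneapi/mpi/latest:/opt/impi \
        myapp.sif /opt/app/bin/solver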
>> On Thu, Jan 27, 2022 at 1:07 PM Brian Dobbins via users
>> <users@lists.open-mpi.org> wrote:
>>
>> Hi Ralph,
>>
>> Thanks for the explanation - in hindsight, that makes perfect sense:
>> each process is operating inside the container and will of course load
>> identical libraries, so data types/sizes can't be inconsistent. I don't
>> know why I didn't realize that before. I imagine the past issues I'd
>> experienced were just due to the PMI differences between the MPI
>> implementations at the time. I owe you a beer or something at the next
>> in-person SC conference!
>>
>> Cheers,
>> - Brian
>>
>> On Wed, Jan 26, 2022 at 4:54 PM Ralph Castain via users
>> <users@lists.open-mpi.org> wrote:
>>
>> There is indeed an ABI difference. However, the _launcher_ doesn't have
>> anything to do with the MPI library. All that is needed is a launcher
>> that can provide the key exchange required to wire up the MPI
>> processes. At this point, both MPICH and OMPI have PMIx support, so you
>> can use the same launcher for both. IMPI does not, and so the IMPI
>> launcher will only support PMI-1 or PMI-2 (I forget which one).
>>
>> You can, however, work around that problem. For example, if the host
>> system is using Slurm, then you could "srun" the containers and let
>> Slurm perform the wireup. Again, you'd have to ensure that OMPI was
>> built to support whatever wireup protocol the Slurm installation
>> supports (which might well be PMIx today). This also works on
>> Cray/ALPS, and it completely bypasses the IMPI issue.
>>
>> Another option I've seen used is to have the host system start the
>> containers (using ssh or whatever), providing the containers with
>> access to a "hostfile" identifying the TCP address of each container.
>> It is then easy for OMPI's mpirun to launch the job across the
>> containers. I use this every day on my machine (using Docker Desktop
>> with Docker containers, but the container tech is irrelevant here) to
>> test OMPI. It is pretty easy to set up, and I should think the sys
>> admins could do so for their users.
>>
>> Finally, you could always install the PMIx Reference RTE (PRRTE) on the
>> cluster, as it executes at user level, and then use PRRTE to launch
>> your OMPI containers. OMPI runs very well under PRRTE - in fact, PRRTE
>> is the RTE embedded in OMPI starting with the v5.0 release.
>>
>> Regardless of your choice of method, the presence of IMPI doesn't
>> preclude using OMPI containers so long as the OMPI library is fully
>> contained in that container. The choice of launch method just depends
>> on how your system is set up.
>>
>> Ralph
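A minimal sketch of the Slurm option Ralph describes first, assuming Slurm was built with PMIx support and that the Open MPI inside the image was built against PMIx as well. The image name, binary path, and task counts are illustrative.

    # Let Slurm launch the container processes and handle the PMIx wire-up;
    # no launcher from the host MPI is involved.
    srun --mpi=pmix -N 4 --ntasks-per-node=16 \
        singularity exec myapp.sif /opt/app/bin/solver
    # "srun --mpi=list" shows which PMI plugins your Slurm installation provides.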
>>> On Jan 26, 2022, at 3:17 PM, Brian Dobbins
>>> <bdobb...@gmail.com> wrote:
>>>
>>> Hi Ralph,
>>>
>>> Afraid I don't understand. If your image has the OMPI libraries
>>> installed in it, what difference does it make what is on your host?
>>> You'll never see the IMPI installation.
>>>
>>> We have been supporting people running that way since Singularity was
>>> originally released, without any problems. The only time you can hit
>>> an issue is if you try to mount the MPI libraries from the host
>>> (i.e., violate the container boundary) - so don't do that and you
>>> should be fine.
>>>
>>> Can you clarify what you mean here? I thought there was an ABI
>>> difference between the various MPICH-based MPIs and OpenMPI, meaning
>>> you can't use a host's Intel MPI to launch a container's
>>> OpenMPI-compiled program. You /can/ use the internal-to-the-container
>>> OpenMPI to launch everything, which is easy for single-node runs but
>>> more challenging for multi-node ones. Maybe my understanding is wrong
>>> or out of date, though?
>>>
>>> Thanks,
>>> - Brian
>>>
>>>> On Jan 26, 2022, at 12:19 PM, Luis Alfredo Pires Barbosa
>>>> <luis_pire...@hotmail.com> wrote:
>>>>
>>>> Hi Ralph,
>>>>
>>>> My Singularity image has Open MPI, but my host doesn't (it has Intel
>>>> MPI), and I am not sure whether the system would work with Intel MPI
>>>> + Open MPI.
>>>>
>>>> Luis
>>>>
>>>> From: Ralph Castain via users <users@lists.open-mpi.org>
>>>> Sent: Wednesday, January 26, 2022 4:01 PM
>>>> To: Open MPI Users <users@lists.open-mpi.org>
>>>> Cc: Ralph Castain <r...@open-mpi.org>
>>>> Subject: Re: [OMPI users] OpenMPI - Intel MPI
>>>>
>>>> Err... the whole point of a container is to put all the library
>>>> dependencies _inside_ it. So why don't you just install OMPI in your
>>>> Singularity image?
>>>>
>>>> On Jan 26, 2022, at 6:42 AM, Luis Alfredo Pires Barbosa via users
>>>> <users@lists.open-mpi.org> wrote:
>>>>
>>>> Hello all,
>>>>
>>>> I have Intel MPI on my cluster, but I am running a Singularity image
>>>> of a software package that uses Open MPI.
>>>>
>>>> Since they may not be compatible, I don't think it is possible to get
>>>> these two different MPIs running on the system. I wonder if there is
>>>> some workaround for this issue.
>>>>
>>>> Any insight would be welcome.
>>>>
>>>> Luis

--
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786
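A minimal sketch of the "everything inside the image" model suggested above, using the container's own Open MPI for a single-node run; the image and binary names are illustrative. Multi-node runs would then rely on one of the launch methods Ralph describes earlier in the thread (srun with PMIx, a hostfile with OMPI's mpirun, or PRRTE).

    # Single node: use the mpirun that is installed inside the image itself.
    singularity exec myapp.sif mpirun -np 8 /opt/app/bin/solver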