Brian,

FWIW

Keep in mind that when running a container on a supercomputer, it is
generally recommended to use the supercomputer's MPI implementation
(fine-tuned and with support for the high-speed interconnect) instead of
the one inside the container (generally a vanilla MPI build with basic
support for TCP and shared memory).
That scenario implies several additional constraints, one of them being
that the MPI libraries of the host and the container are (to oversimplify)
ABI compatible.

In your case, you would have to rebuild your container with MPICH (instead
of Open MPI) so it can be "substituted" at run time with Intel MPI (which
is MPICH-based and ABI compatible).

Cheers,

Gilles

On Thu, Jan 27, 2022 at 1:07 PM Brian Dobbins via users <
users@lists.open-mpi.org> wrote:

>
> Hi Ralph,
>
>   Thanks for the explanation - in hindsight, that makes perfect sense,
> since each process is operating inside the container and will of course
> load up identical libraries, so data types/sizes can't be inconsistent.  I
> don't know why I didn't realize that before.  I imagine the past issues I'd
> experienced were just due to the PMI differences in the different MPI
> implementations at the time.  I owe you a beer or something at the next
> in-person SC conference!
>
>   Cheers,
>   - Brian
>
>
> On Wed, Jan 26, 2022 at 4:54 PM Ralph Castain via users <
> users@lists.open-mpi.org> wrote:
>
>> There is indeed an ABI difference. However, the _launcher_ doesn't have
>> anything to do with the MPI library. All that is needed is a launcher that
>> can provide the key exchange required to wire up the MPI processes. At this
>> point, both MPICH and OMPI have PMIx support, so you can use the same
>> launcher for both. IMPI does not, and so the IMPI launcher will only
>> support PMI-1 or PMI-2 (I forget which one).
>>
>> You can, however, work around that problem. For example, if the host
>> system is using Slurm, then you could "srun" the containers and let Slurm
>> perform the wireup. Again, you'd have to ensure that OMPI was built to
>> support whatever wireup protocol the Slurm installation supported (which
>> might well be PMIx today). Also works on Cray/ALPS. Completely bypasses the
>> IMPI issue.
>>
>> Another option I've seen used is to have the host system start the
>> containers (using ssh or whatever), providing the containers with access to
>> a "hostfile" identifying the TCP address of each container. It is then easy
>> for OMPI's mpirun to launch the job across the containers. I use this every
>> day on my machine (using Docker Desktop with Docker containers, but the
>> container tech is irrelevant here) to test OMPI. Pretty easy to set that
>> up, and I should think the sys admins could do so for their users.
>>
>> Finally, you could always install the PMIx Reference RTE (PRRTE) on the
>> cluster as that executes at user level, and then use PRRTE to launch your
>> OMPI containers. OMPI runs very well under PRRTE - in fact, PRRTE is the
>> RTE embedded in OMPI starting with the v5.0 release.
>>
>> Regardless of your choice of method, the presence of IMPI doesn't
>> preclude using OMPI containers so long as the OMPI library is fully
>> contained in that container. Choice of launch method just depends on how
>> your system is set up.
>>
>> Ralph
>>
>>
>> On Jan 26, 2022, at 3:17 PM, Brian Dobbins <bdobb...@gmail.com> wrote:
>>
>>
>> Hi Ralph,
>>
>> Afraid I don't understand. If your image has the OMPI libraries installed
>>> in it, what difference does it make what is on your host? You'll never see
>>> the IMPI installation.
>>>
>>
>>> We have been supporting people running that way since Singularity was
>>> originally released, without any problems. The only time you can hit an
>>> issue is if you try to mount the MPI libraries from the host (i.e., violate
>>> the container boundary) - so don't do that and you should be fine.
>>>
>>
>>   Can you clarify what you mean here?  I thought there was an ABI
>> difference between the various MPICH-based MPIs and OpenMPI, meaning you
>> can't use a host's Intel MPI to launch a container's OpenMPI-compiled
>> program.  You *can* use the internal-to-the-container OpenMPI to launch
>> everything, which is easy for single-node runs but more challenging for
>> multi-node ones.  Maybe my understanding is wrong or out of date though?
>>
>>   Thanks,
>>   - Brian
>>
>>
>>
>>>
>>>
>>> On Jan 26, 2022, at 12:19 PM, Luis Alfredo Pires Barbosa <
>>> luis_pire...@hotmail.com> wrote:
>>>
>>> Hi Ralph,
>>>
>>> My singularity image has OpenMPI, but my host doesn't (it has Intel MPI). And I
>>> am not sure if the system would work with Intel + OpenMPI.
>>>
>>> Luis
>>>
>>> Sent from Mail <https://go.microsoft.com/fwlink/?LinkId=550986>
>>> for Windows
>>>
>>> *From: *Ralph Castain via users <users@lists.open-mpi.org>
>>> *Sent: *Wednesday, January 26, 2022 16:01
>>> *To: *Open MPI Users <users@lists.open-mpi.org>
>>> *Cc: *Ralph Castain <r...@open-mpi.org>
>>> *Subject: *Re: [OMPI users] OpenMPI - Intel MPI
>>>
>>> Err...the whole point of a container is to put all the library
>>> dependencies _inside_ it. So why don't you just install OMPI in your
>>> singularity image?
>>>
>>>
>>>
>>> On Jan 26, 2022, at 6:42 AM, Luis Alfredo Pires Barbosa via users <
>>> users@lists.open-mpi.org> wrote:
>>>
>>> Hello all,
>>>
>>> I have Intel MPI on my cluster, but I am running a singularity image of a
>>> software package which uses OpenMPI.
>>>
>>> Since they may not be compatible, I don't think it is possible to get
>>> these two different MPI implementations running on the same system.
>>> I wonder if there is some workaround for this issue.
>>>
>>> Any insight would be welcome.
>>> Luis
>>>
>>>
>>>
>>
