Re: [OMPI users] unable to launch a job on a system with OmniPath

2021-05-27 Thread Heinz, Michael William via users
fabric support and would only work via TCP. So any advice on how to accomplish my goal would be appreciated. I realize that performance-wise that is going to be quite... sad. But currently that's not the main concern. Regards, Pavel Mezentsev. On Wed, May 19, 2021 at 5:40 PM Heinz, Michael Willia

Re: [OMPI users] unable to launch a job on a system with OmniPath

2021-05-19 Thread Heinz, Michael William via users
because configure spit out this warning: "WARNING: PSM2 needs to be version 11.2.173 or later. Disabling MTL". The above cluster is running IntelOPA 10.9.2. Tim From: users <users-boun...@lists.open-mpi.org> on behalf of "Heinz, Michael William via users"

Re: [OMPI users] unable to launch a job on a system with OmniPath

2021-05-19 Thread Heinz, Michael William via users
After thinking about this for a few more minutes, it occurred to me that you might be able to "fake" the required UUID support by passing it as a shell variable. For example: export OMPI_MCA_orte_precondition_transports="0123456789ABCDEF-0123456789ABCDEF" would probably do it. However, note
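
A minimal sketch of the workaround suggested above, assuming the variable only needs a value shaped like the example (16 hex digits, a dash, 16 hex digits). The helper name make_transport_key is hypothetical and not part of Open MPI. Because the key identifies the job across nodes, it is presumably generated once and exported before launch so that every process sees the same value.

```shell
# Hypothetical helper (not part of Open MPI): print a key shaped like
# "0123456789ABCDEF-0123456789ABCDEF" built from 16 random bytes.
make_transport_key() {
    # od prints 16 random bytes as hex pairs; tr strips the whitespace
    # and upper-cases the digits to match the example in the message.
    hex=$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n' | tr 'a-f' 'A-F')
    # Split the 32 hex characters into two 16-character halves.
    first=$(printf '%s' "$hex" | cut -c1-16)
    second=$(printf '%s' "$hex" | cut -c17-32)
    printf '%s-%s\n' "$first" "$second"
}

# Generate the key once, then export it before starting the job.
OMPI_MCA_orte_precondition_transports=$(make_transport_key)
export OMPI_MCA_orte_precondition_transports
echo "$OMPI_MCA_orte_precondition_transports"
```

Whether a launcher forwards this variable to remote nodes depends on the launcher; that part is not covered by the sketch.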

Re: [OMPI users] unable to launch a job on a system with OmniPath

2021-05-19 Thread Heinz, Michael William via users
So, the bad news is that the PSM2 MTL requires ORTE - ORTE generates a UUID to identify the job across all nodes in the fabric, allowing processes to find each other over OPA at init time. I believe the reason this works when you use OFI/libfabric is that libfabric generates its own UUIDs.

Re: [OMPI users] unable to launch a job on a system with OmniPath

2021-05-10 Thread Heinz, Michael William via users
That warning is an annoying bit of cruft from the openib / verbs provider that can be ignored. (Actually, I recommend using "-btl ^openib" to suppress the warning.) That said, there is a known issue with selecting PSM2 and OMPI 4.1.0. I'm not sure that that's the problem you're hitting,

Re: [OMPI users] Building Open-MPI with Intel C

2021-04-07 Thread Heinz, Michael William via users
ou looked at using Easybuild? Would be good to have your input there maybe. On Wed, 7 Apr 2021 at 01:01, Heinz, Michael William via users <users@lists.open-mpi.org> wrote: I’m having a heck of a time building OMPI with Intel C. Compilation goes fine, installation goes fine, comp

Re: [OMPI users] Building Open-MPI with Intel C

2021-04-07 Thread Heinz, Michael William via users
to the Intel runtime). IIRC, there is also an option in the Intel compiler to statically link to the runtime. Cheers, Gilles On Wed, Apr 7, 2021 at 9:00 AM Heinz, Michael William via users <users@lists.open-mpi.org> wrote: I’m having a heck of a time building OMPI with Intel C. Com

[OMPI users] Building Open-MPI with Intel C

2021-04-06 Thread Heinz, Michael William via users
I'm having a heck of a time building OMPI with Intel C. Compilation goes fine, installation goes fine, compiling test apps (the OSU benchmarks) goes fine... but when I go to actually run an MPI app I get: [awbp025:~/work/osu-icc](N/A)$ /usr/mpi/icc/openmpi-icc/bin/mpirun -np 2 -H

Re: [OMPI users] Newbie With Issues

2021-03-30 Thread Heinz, Michael William via users
It looks like you're trying to build Open MPI with the Intel C compiler. TBH - I think that icc isn't included with the latest release of oneAPI, I think they've switched to including clang instead. I had a similar issue to yours but I resolved it by installing a 2020 version of the Intel HPC

Re: [OMPI users] Error intialising an OpenFabrics device.

2021-03-13 Thread Heinz, Michael William via users
I’ve begun getting this annoyingly generic warning, too. It appears to be coming from the openib provider. If you disable it with -btl ^openib the warning goes away. Sent from my iPad On Mar 13, 2021, at 3:28 PM, Bob Beattie via users wrote: Hi everyone, To be honest, as an

Re: [OMPI users] Stable and performant openMPI version for Ubuntu20.04 ?

2021-03-04 Thread Heinz, Michael William via users
What interconnect are you using at run time? That is, are you using Ethernet or InfiniBand or Omni-Path? Sent from my iPad On Mar 4, 2021, at 5:05 AM, Raut, S Biplab via users wrote: [AMD Official Use Only - Internal Distribution Only] After downloading a particular openMPI version, let’s

[OMPI users] Unexpected issue with 4.1.x build

2021-03-02 Thread Heinz, Michael William via users
While testing the recent UCX PR I noticed I was getting this warning: -- WARNING: There was an error initializing an OpenFabrics device. Local host: cn-priv-01 Local device: hfi1_0

Re: [OMPI users] [EXTERNAL] Re: OpenMPI 4.0.5 error with Omni-path

2021-01-28 Thread Heinz, Michael William via users
] [EXTERNAL] Re: OpenMPI 4.0.5 error with Omni-path Patrick, Do you have any PSM2_* or HFI_* environment variables defined in your run time environment that could be affecting things? -Original Message- From: users On Behalf Of Heinz, Michael William via users Sent: Wednesday, January

Re: [OMPI users] [EXTERNAL] Re: OpenMPI 4.0.5 error with Omni-path

2021-01-27 Thread Heinz, Michael William via users
Patrick, Do you have any PSM2_* or HFI_* environment variables defined in your run time environment that could be affecting things? -Original Message- From: users On Behalf Of Heinz, Michael William via users Sent: Wednesday, January 27, 2021 3:37 PM To: Open MPI Users Cc: Heinz

Re: [OMPI users] [EXTERNAL] Re: OpenMPI 4.0.5 error with Omni-path

2021-01-27 Thread Heinz, Michael William via users
Unfortunately, OPA/PSM support for Debian isn't handled by Intel directly or by Cornelis Networks - but I should point out you can download the latest official source for PSM2 and the drivers from Github. -Original Message- From: users On Behalf Of Michael Di Domenico via users Sent:

Re: [OMPI users] OpenMPI 4.0.5 error with Omni-path

2021-01-26 Thread Heinz, Michael William via users
Patrick, how are you using original PSM if you’re using Omni-Path hardware? The original PSM was written for QLogic DDR and QDR InfiniBand adapters. As far as needing openib - the issue is that the PSM2 MTL doesn’t support a subset of MPI operations that we previously used the pt2pt BTL for. For

Re: [OMPI users] OpenMPI 4.0.5 error with Omni-path

2021-01-25 Thread Heinz, Michael William via users
Patrick, is your application multi-threaded? PSM2 was not originally designed for multiple threads per process. I do know that the OSU alltoallV test does pass when I try it. Sent from my iPad On Jan 25, 2021, at 12:57 PM, Patrick Begou via users wrote: Hi Howard and Michael,

Re: [OMPI users] OpenMPI 4.0.5 error with Omni-path

2021-01-25 Thread Heinz, Michael William via users
What happens if you specify -mtl ofi ? -Original Message- From: users On Behalf Of Patrick Begou via users Sent: Monday, January 25, 2021 12:54 PM To: users@lists.open-mpi.org Cc: Patrick Begou Subject: Re: [OMPI users] OpenMPI 4.0.5 error with Omni-path Hi Howard and Michael, thanks

[OMPI users] OpenMPI 4.0.5 error with Omni-path

2021-01-25 Thread Heinz, Michael William via users
Patrick, You really have to provide us some detailed information if you want assistance. At a minimum we need to know if you're using the PSM2 MTL or the OFI MTL and what the actual error is. Please provide the actual command line you are having problems with, along with any errors. In

Re: [OMPI users] can't open /dev/ipath, network down (err=26)

2020-05-09 Thread Heinz, Michael William via users
That's it! I was trying to remember what the setting was but I haven’t worked on those HCAs since around 2012, so it was faint. That said, I found the Intel TrueScale manual online at

[OMPI users] can't open /dev/ipath, network down (err=26)

2020-05-09 Thread Heinz, Michael William via users
Prentice, Avoiding the obvious question of whether your FM is running and the fabric is in an active state, it sounds like you're exhausting a resource on the cards. Ralph is correct about support for QLogic cards being long past but I’ll see what I can dig up in the archives on Monday to see if

[OMPI users] Subject: need a tool and its use to verify use of infiniband network

2020-01-16 Thread Heinz, Michael William via users
btl_base_verbose may do what you need. Add it to your mpirun arguments. For example: [LINUX hds1fna2271 20200116_1404 mpi_apps]# /usr/mpi/gcc/openmpi-3.1.6/bin/mpirun -np 2 -map-by node --allow-run-as-root -machinefile /usr/src/opa/mpi_apps/mpi_hosts -mca btl self,openib,vader -mca
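
The command line above is cut off, so the following is only a sketch of the same idea through the environment: Open MPI reads MCA parameters from variables with the OMPI_MCA_ prefix. The verbosity level 100 and the application name in the comment are assumptions, not values taken from the message.

```shell
# Assumption: 100 is used here as a high verbosity level; adjust as needed.
OMPI_MCA_btl_base_verbose=100
export OMPI_MCA_btl_base_verbose

# Confirm what subsequent mpirun invocations in this shell will inherit.
echo "btl_base_verbose=${OMPI_MCA_btl_base_verbose}"

# Then launch as usual, for example (hypothetical binary name):
#   mpirun -np 2 ./my_mpi_app
```

The BTL messages printed at startup should then show which transport components were opened and selected.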

Re: [OMPI users] silent failure for large allgather

2019-09-25 Thread Heinz, Michael William via users
Emmanuel Thomé, Thanks for bringing this to our attention. It turns out this issue affects all OFI providers in open-mpi. We've applied a fix to the 3.0.x and later branches of open-mpi/ompi on github. However, you should be aware that this fix simply adds the appropriate error message, it