Re: [OMPI users] "MCW rank 0 is not bound (or bound to all available processors)" when running multiple jobs concurrently

2024-04-15 Thread Gilles Gouaillardet via users
Greg,

If Open MPI was built with UCX, your jobs will likely use UCX (and the
shared memory provider) even if running on a single node.
You can
mpirun --mca pml ob1 --mca btl self,sm ...
if you want to avoid using UCX.

What is a typical mpirun command line used under the hood by your "make
test"?
Though the warning can probably be ignored, the SIGILL is definitely an issue.
I encourage you to have your app dump a core in order to figure out where
this is coming from.
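
For example (a rough sketch only; ./your_test is a placeholder for whatever binary
"make test" runs, and the core file name depends on your kernel.core_pattern setting),
inside the container you could do:

ulimit -c unlimited
mpirun -np 4 ./your_test
gdb ./your_test ./core.<pid>
(gdb) bt

The backtrace should point at the function that raised the SIGILL.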


Cheers,

Gilles

On Tue, Apr 16, 2024 at 5:20 AM Greg Samonds via users <
users@lists.open-mpi.org> wrote:

> Hello,
>
>
>
> We’re running into issues with jobs failing in a non-deterministic way
> when running multiple jobs concurrently within a “make test” framework.
>
>
>
> Make test is launched from within a shell script running inside a Podman
> container, and we’re typically running with “-j 20” and “-np 4” (20 jobs
> concurrently with 4 procs each).  We’ve also tried reducing the number of
> jobs to no avail.  Each time the battery of test cases is run, about 2 to 4
> different jobs out of around 200 fail with the following errors:
>
>
>
>
> [podman-ci-rocky-8.8:03528] MCW rank 1 is not bound (or bound to all available processors)
> [podman-ci-rocky-8.8:03540] MCW rank 3 is not bound (or bound to all available processors)
> [podman-ci-rocky-8.8:03519] MCW rank 0 is not bound (or bound to all available processors)
> [podman-ci-rocky-8.8:03533] MCW rank 2 is not bound (or bound to all available processors)
>
> Program received signal SIGILL: Illegal instruction.
>
> Some info about our setup:
>
>- Ampere Altra 80 core ARM machine
>- Open MPI 4.1.7a1 from HPC-X v2.18
>- Rocky Linux 8.6 host, Rocky Linux 8.8 container
>- Podman 4.4.1
>- This machine has a Mellanox Connect X-6 Lx NIC, however we’re
>avoiding the Mellanox software stack by running in a container, and these
>are single node jobs only
>
>
>
> We tried passing “--bind-to none” to the running jobs, and while this
> seemed to reduce the number of failing jobs on average, it didn’t eliminate
> the issue.
>
>
>
> We also encounter the following warning:
>
>
>
> [1712927028.412063] [podman-ci-rocky-8:3519 :0]  sock.c:514  UCX  WARN
> unable to read somaxconn value from /proc/sys/net/core/somaxconn file
>
>
>
> …however as far as I can tell this is probably unrelated and occurs
> because the associated file isn’t accessible inside the container, and
> after checking the UCX source I can see that SOMAXCONN is picked up from
> the system headers anyway.
>
>
>
> If anyone has hints about how to work around this issue we’d greatly
> appreciate it!
>
>
>
> Thanks,
>
> Greg
>


Re: [OMPI users] Subject: Clarification about mpirun behavior in Slurm jobs

2024-02-24 Thread Gilles Gouaillardet via users
Christopher,

I do not think Open MPI explicitly asks SLURM which cores have been
assigned on each node.
So if you are planning to run multiple jobs on the same node, your best bet
is probably to have SLURM
use cpusets.
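
For example (just a sketch, please double check against the documentation of your
Slurm version), core confinement is usually enabled with the cgroup task plugin:

# slurm.conf
TaskPlugin=task/affinity,task/cgroup

# cgroup.conf
ConstrainCores=yes

With that in place, every process started inside a job (including mpirun and the
MPI tasks) only sees the cores Slurm allocated to that job, so two jobs sharing a
node can no longer end up on the same cores.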

Cheers,

Gilles

On Sat, Feb 24, 2024 at 7:25 AM Christopher Daley via users <
users@lists.open-mpi.org> wrote:

> Dear Support,
>
> I'm seeking clarification about the expected behavior of mpirun in Slurm
> jobs.
>
> Our setup consists of using Slurm for resource allocation and OpenMPI
> mpirun to launch MPI applications. We have found that when two Slurm jobs
> have been allocated different cores on the same compute node that the MPI
> ranks in Slurm job 1 map to the same cores as Slurm job 2. It appears that
> OpenMPI mpirun is not considering the details of the Slurm allocation. We
> get expected behavior when srun is employed as the MPI launcher instead of
> mpirun, i.e. the MPI ranks in Slurm job 1 use different cores than the MPI
> ranks in Slurm job 2.
>
> We have observed this with OpenMPI-4.1.6 and OpenMPI-5.0.2. Should we
> expect that the mpirun in each job will only use the exact cores that were
> allocated by Slurm?
>
> Thanks,
> Chris
>


Re: [OMPI users] Seg error when using v5.0.1

2024-01-31 Thread Gilles Gouaillardet via users
Hi,

please open an issue on GitHub at https://github.com/open-mpi/ompi/issues
and provide the requested information.

If the compilation failed when configured with --enable-debug, please share
the logs.

The name of the WRF subroutine suggests the crash might occur in
MPI_Comm_split().
If so, are you able to craft a reproducer that causes the crash?

How many nodes and MPI tasks are needed in order to evidence the crash?


Cheers,

Gilles

On Wed, Jan 31, 2024 at 10:09 PM afernandez via users <
users@lists.open-mpi.org> wrote:

> Hello Joseph,
> Sorry for the delay but I didn't know if I was missing something yesterday
> evening and wanted to double check everything this morning. This is for WRF
> but other apps exhibit the same behavior.
> * I had no problem with the serial version (and gdb obviously didn't
> report any issue).
> * I tried compiling with the --enable-debug flag but it was generating
> errors during the compilation and never completed.
> * I went back to my standard flags for debugging: -g -fbacktrace -ggdb
> -fcheck=bounds,do,mem,pointer -ffpe-trap=invalid,zero,overflow. WRF is
> still crashing with little extra info vs yesterday:
> Backtrace for this error:
> #0  0x7f5a4e54451f in ???
> at ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
> #1  0x7f5a4e5a73fe in __GI___libc_free
> at ./malloc/malloc.c:3368
> #2  0x7f5a4c7aa5c3 in ???
> #3  0x7f5a4e83b048 in ???
> #4  0x7f5a4e7d3ef1 in ???
> #5  0x7f5a4e8dab7b in ???
> #6  0x8f6bbf in __module_dm_MOD_split_communicator
> at /home/ubuntu/WRF-4.5.2/frame/module_dm.f90:5734
> #7  0x1879ebd in init_modules_
> at /home/ubuntu/WRF-4.5.2/share/init_modules.f90:63
> #8  0x406fe4 in __module_wrf_top_MOD_wrf_init
> at ../main/module_wrf_top.f90:130
> #9  0x405ff3 in wrf
> at /home/ubuntu/WRF-4.5.2/main/wrf.f90:22
> #10  0x40605c in main
> at /home/ubuntu/WRF-4.5.2/main/wrf.f90:6
>
> --
> Primary job  terminated normally, but 1 process returned
> a non-zero exit code. Per user-direction, the job has been aborted.
> --
> mpirun noticed that process rank 0 with PID 0 on node ip-172-31-31-163
> exited on signal 11 (Segmentation fault).
> --
> Any pointers on what might be going on here, as this never happened with
> OMPI v4? Thanks.
>
>
>
> Joseph Schuchart via users wrote:
>
>
> Hello,
>
> This looks like memory corruption. Do you have more details on what your
> app is doing? I don't see any MPI calls inside the call stack. Could you
> rebuild Open MPI with debug information enabled (by adding `--enable-debug`
> to configure)? If this error occurs on singleton runs (1 process) then you
> can easily attach gdb to it to get a better stack trace. Also, valgrind may
> help pin down the problem by telling you which memory block is being free'd
> here.
>
> Thanks
> Joseph
>
> On 1/30/24 07:41, afernandez via users wrote:
>
> Hello,
> I upgraded one of the systems to v5.0.1 and have compiled everything
> exactly as dozens of previous times with v4. I wasn't expecting any issue
> (and the compilations didn't report anything out of ordinary) but running
> several apps has resulted in error messages such as:
> Backtrace for this error:
> #0  0x7f7c9571f51f in ???
> at ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
> #1  0x7f7c957823fe in __GI___libc_free
> at ./malloc/malloc.c:3368
> #2  0x7f7c93a635c3 in ???
> #3  0x7f7c95f84048 in ???
> #4  0x7f7c95f1cef1 in ???
> #5  0x7f7c95e34b7b in ???
> #6  0x6e05be in ???
> #7  0x6e58d7 in ???
> #8  0x405d2c in ???
> #9  0x7f7c95706d8f in __libc_start_call_main
> at ../sysdeps/nptl/libc_start_call_main.h:58
> #10  0x7f7c95706e3f in __libc_start_main_impl
> at ../csu/libc-start.c:392
> #11  0x405d64 in ???
> #12  0x in ???
> OS is Ubuntu 22.04, OpenMPI was built with GCC 13.2, and before building
> OpenMPI, I had previously built the hwloc (2.10.0) library at
> /usr/lib/x86_64-linux-gnu. Maybe I'm missing something pretty basic, but
> the problem seems to be related to memory allocation.
> Thanks.
>
>
>
>
>


Re: [OMPI users] OpenMPI 5.0.1 Installation Failure

2024-01-26 Thread Gilles Gouaillardet via users
Hi,

Please open a GitHub issue at https://github.com/open-mpi/ompi/issues and
provide the requested information.

Cheers,

Gilles

On Sat, Jan 27, 2024 at 12:04 PM Kook Jin Noh via users <
users@lists.open-mpi.org> wrote:

> Hi,
>
>
>
> I’m installing OpenMPI 5.0.1 on Archlinux 6.7.1. Everything goes well till:
>
>
>
> Making check in datatype
>
> make[2]: Entering directory
> '/home/vorlket/build/openmpi-ucx/src/openmpi-5.0.1/test/datatype'
>
> make  opal_datatype_test unpack_hetero checksum position
> position_noncontig ddt_test ddt_raw ddt_raw2 unpack_ooo ddt_pack external32
> large_data partial to_self reduce_local
>
> make[3]: Entering directory
> '/home/vorlket/build/openmpi-ucx/src/openmpi-5.0.1/test/datatype'
>
>   CCLD opal_datatype_test
>
>   CCLD unpack_hetero
>
>   CCLD checksum
>
>   CCLD position
>
>   CCLD position_noncontig
>
>   CCLD ddt_test
>
>   CCLD ddt_raw
>
>   CCLD ddt_raw2
>
>   CCLD unpack_ooo
>
>   CCLD ddt_pack
>
>   CCLD external32
>
>   CCLD large_data
>
>   CCLD partial
>
>   CCLD to_self
>
>   CCLD reduce_local
>
> make[3]: Leaving directory
> '/home/vorlket/build/openmpi-ucx/src/openmpi-5.0.1/test/datatype'
>
> make  check-TESTS
>
> make[3]: Entering directory
> '/home/vorlket/build/openmpi-ucx/src/openmpi-5.0.1/test/datatype'
>
> make[4]: Entering directory
> '/home/vorlket/build/openmpi-ucx/src/openmpi-5.0.1/test/datatype'
>
> ../../config/test-driver: line 112: 1380808 Segmentation fault  (core
> dumped) "$@" >> "$log_file" 2>&1
>
> FAIL: opal_datatype_test
>
> PASS: unpack_hetero
>
> ../../config/test-driver: line 112: 1380857 Segmentation fault  (core
> dumped) "$@" >> "$log_file" 2>&1
>
> FAIL: checksum
>
> ../../config/test-driver: line 112: 1380884 Segmentation fault  (core
> dumped) "$@" >> "$log_file" 2>&1
>
> FAIL: position
>
> ../../config/test-driver: line 112: 1380916 Segmentation fault  (core
> dumped) "$@" >> "$log_file" 2>&1
>
> FAIL: position_noncontig
>
> ../../config/test-driver: line 112: 1380944 Segmentation fault  (core
> dumped) "$@" >> "$log_file" 2>&1
>
> FAIL: ddt_test
>
> ../../config/test-driver: line 112: 1380975 Segmentation fault  (core
> dumped) "$@" >> "$log_file" 2>&1
>
> FAIL: ddt_raw
>
> PASS: ddt_raw2
>
> PASS: unpack_ooo
>
> ../../config/test-driver: line 112: 1381044 Segmentation fault  (core
> dumped) "$@" >> "$log_file" 2>&1
>
> FAIL: ddt_pack
>
> ../../config/test-driver: line 112: 1381070 Segmentation fault  (core
> dumped) "$@" >> "$log_file" 2>&1
>
> FAIL: external32
>
> PASS: large_data
>
> ../../config/test-driver: line 112: 1381120 Segmentation fault  (core
> dumped) "$@" >> "$log_file" 2>&1
>
> FAIL: partial
>
>
> 
>
> Testsuite summary for Open MPI 5.0.1
>
>
> 
>
> # TOTAL: 13
>
> # PASS:  4
>
> # SKIP:  0
>
> # XFAIL: 0
>
> # FAIL:  9
>
> # XPASS: 0
>
> # ERROR: 0
>
>
> 
>
> See test/datatype/test-suite.log
>
> Please report to https://www.open-mpi.org/community/help/
>
>
> 
>
> make[4]: *** [Makefile:2012: test-suite.log] Error 1
>
> make[4]: Leaving directory
> '/home/vorlket/build/openmpi-ucx/src/openmpi-5.0.1/test/datatype'
>
> make[3]: *** [Makefile:2120: check-TESTS] Error 2
>
> make[3]: Leaving directory
> '/home/vorlket/build/openmpi-ucx/src/openmpi-5.0.1/test/datatype'
>
> make[2]: *** [Makefile:2277: check-am] Error 2
>
> make[2]: Leaving directory
> '/home/vorlket/build/openmpi-ucx/src/openmpi-5.0.1/test/datatype'
>
> make[1]: *** [Makefile:1416: check-recursive] Error 1
>
> make[1]: Leaving directory
> '/home/vorlket/build/openmpi-ucx/src/openmpi-5.0.1/test'
>
> make: *** [Makefile:1533: check-recursive] Error 1
>
> make: Leaving directory '/home/vorlket/build/openmpi-ucx/src/openmpi-5.0.1'
>
> ==> ERROR: A failure occurred in check().
>
> Aborting...
>


Re: [OMPI users] Binding to thread 0

2023-09-08 Thread Gilles Gouaillardet via users
Luis,

You can pass the --bind-to hwthread option in order to bind on the first
thread of each core.
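
For example (an untested sketch based on your command below; ./your_app is a
placeholder):

mpirun -np $MPIRANKS --rankfile ./myrankfile --bind-to hwthread --report-bindings ./your_app

--report-bindings makes Open MPI print where each rank ended up, so you can check
in the output (or with htop) that every rank sits on thread 0 of its core.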


Cheers,

Gilles

On Fri, Sep 8, 2023 at 8:30 PM Luis Cebamanos via users <
users@lists.open-mpi.org> wrote:

> Hello,
>
> Up to now, I have been using numerous ways of binding with wrappers
> (numactl, taskset) whenever I wanted to play with core placing. Another way
> I have been using is via -rankfile, however I notice that some ranks jump
> from thread 0 to thread 1 on SMT chips. I can control this with numactl for
> instance, but it would be great to see similar behaviour when using
> -rankfile. Is there a way to pack all ranks to one of the threads of each
> core (preferibly to thread 0) so I can nicely see all ranks with htop on
> either left or right of the screen?
>
> The command I am using is pretty simple:
>
> mpirun -np $MPIRANKS --rankfile ./myrankfile
>
> and ./myrankfile looks like
>
> rank 33=argon slot=33
> rank 34=argon slot=34
> rank 35=argon slot=35
> rank 36=argon slot=36
>
> Thanks!
>


Re: [OMPI users] MPI_Init_thread error

2023-07-25 Thread Gilles Gouaillardet via users
Aziz,

When using direct run (e.g. srun), OpenMPI has to interact with SLURM.
This is typically achieved via PMI2 or PMIx

You can
srun --mpi=list
to list the available options on your system

if PMIx is available, you can
srun --mpi=pmix ...

if only PMI2 is available, you need to make sure Open MPI was built with
SLURM support (e.g. configure --with-slurm ...)
and then
srun --mpi=pmi2 ...
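
For example (a sketch based on your job below, assuming the su2 module is already
loaded and pmix shows up in the list):

srun --mpi=list
srun --mpi=pmix -p defq --nodes=1 --ntasks-per-node=1 --time=01:00:00 SU2_CFD config.cfg

If only pmi2 is listed, rebuild Open MPI with --with-slurm and use --mpi=pmi2 instead.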


Cheers,

Gilles

On Tue, Jul 25, 2023 at 5:07 PM Aziz Ogutlu via users <
users@lists.open-mpi.org> wrote:

> Hi there all,
> We're using Slurm 21.08 on Redhat 7.9 HPC cluster with OpenMPI 4.0.3 + gcc
> 8.5.0.
> When we run command below for call SU2, we get an error message:
>
> $ srun -p defq --nodes=1 --ntasks-per-node=1 --time=01:00:00 --pty bash -i
> $ module load su2/7.5.1
> $ SU2_CFD config.cfg
>
> *** An error occurred in MPI_Init_thread
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> *** and potentially your MPI job)
> [cnode003.hpc:17534] Local abort before MPI_INIT completed completed
> successfully, but am not able to aggregate error messages, and not able to
> guarantee that all other processes were killed!
>
> --
> Best regards,
> Aziz Öğütlü
>
> Eduline Bilişim Sanayi ve Ticaret Ltd. Şti.  www.eduline.com.tr
> Merkez Mah. Ayazma Cad. No:37 Papirus Plaza
> Kat:6 Ofis No:118 Kağıthane -  İstanbul - Türkiye 34406
> Tel : +90 212 324 60 61 Cep: +90 541 350 40 72
>
>


Re: [OMPI users] libnuma.so error

2023-07-19 Thread Gilles Gouaillardet via users
Luis,

That can happen if a component is linked with libnuma.so:
Open MPI will fail to open it and try to fall back on another one.

You can run ldd on the mca_*.so components in the /.../lib/openmpi directory
to figure out which is using libnuma.so and assess if it is needed or not.
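
For example (a one-liner sketch; replace <ompi_prefix> with your actual Open MPI
installation directory):

for f in <ompi_prefix>/lib/openmpi/mca_*.so; do ldd "$f" | grep -q libnuma && echo "$f"; done

Any component printed there is the one pulling in libnuma.so, and you can then
decide whether that component is actually needed for your runs.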


Cheers,

Gilles

On Wed, Jul 19, 2023 at 11:36 PM Luis Cebamanos via users <
users@lists.open-mpi.org> wrote:

> Hello,
>
> I was wondering if anyone has ever seen the following runtime error:
>
> mpirun -np 32 ./hello
> .
> [LOG_CAT_SBGP] libnuma.so: cannot open shared object file: No such file
> or directory
> [LOG_CAT_SBGP] Failed to dlopen libnuma.so. Fallback to GROUP_BY_SOCKET
> manual.
> .
>
> The funny thing is that the binary is executed despite the errors.
> What could be causing it?
>
> Regards,
> Lusi
>


Re: [OMPI users] OpenMPI crashes with TCP connection error

2023-06-16 Thread Gilles Gouaillardet via users
Kurt,


I think Joachim was also asking for the command line used to launch your
application.

Since you are using Slurm and MPI_Comm_spawn(), it is important to
understand whether you are using mpirun or srun

FWIW, --mpi=pmix is a srun option. You can run srun --mpi=list to find the
available options.


Cheers,

Gilles

On Sat, Jun 17, 2023 at 2:53 AM Mccall, Kurt E. (MSFC-EV41) via users <
users@lists.open-mpi.org> wrote:

> Joachim,
>
>
>
> Sorry to make you resort to divination.   My sbatch command is as follows:
>
>
>
> sbatch --ntasks-per-node=24 --nodes=16 --ntasks=384  --job-name $job_name
> --exclusive --no-kill --verbose $release_dir/script.bash &
>
>
>
> --mpi=pmix isn’t an option recognized by sbatch.   Is there an
> alternative?   The slurm doc you mentioned has the following paragraph.  Is
> it still true with OpenMpi 4.1.5?
>
>
>
> “NOTE: OpenMPI has a limitation that does not support calls to
> MPI_Comm_spawn() from within a Slurm allocation. If you need to use the
> MPI_Comm_spawn() function you will need to use another MPI
> implementation combined with PMI-2 since PMIx doesn't support it either.”
>
>
>
> I use MPI_Comm_spawn extensively in my application.
>
>
>
> Thanks,
>
> Kurt
>
>
>
>
>
> *From:* Jenke, Joachim 
> *Sent:* Thursday, June 15, 2023 5:33 PM
> *To:* Open MPI Users 
> *Cc:* Mccall, Kurt E. (MSFC-EV41) 
> *Subject:* [EXTERNAL] Re: OpenMPI crashes with TCP connection error
>
>
>
>
>
>
> Hi Kurt,
>
>
>
> Without knowing your exact MPI launch command, my crystal orb thinks you
> might want to try the -mpi=pmix flag for srun as documented for
> slurm+openmpi:
>
> https://slurm.schedmd.com/mpi_guide.html#open_mpi
>
>
>
> -Joachim
> --
>
> *From:* users  on behalf of Mccall,
> Kurt E. (MSFC-EV41) via users 
> *Sent:* Thursday, June 15, 2023 11:56:28 PM
> *To:* users@lists.open-mpi.org 
> *Cc:* Mccall, Kurt E. (MSFC-EV41) 
> *Subject:* [OMPI users] OpenMPI crashes with TCP connection error
>
>
>
> My job immediately crashes with the error message below.   I don’t know
> where to begin looking for the cause
>
> of the error, or what information to provide to help you understand it.
> Maybe you could clue me in .
>
>
>
> I am on RedHat 4.18.0, using Slurm 20.11.8 and OpenMPI 4.1.5 compiled with
> gcc 8.5.0.
>
> I built OpenMPI with the following  “configure” command:
>
>
>
> ./configure --prefix=/opt/openmpi/4.1.5_gnu --with-slurm --enable-debug
>
>
>
>
>
>
>
> WARNING: Open MPI accepted a TCP connection from what appears to be a
>
> another Open MPI process but cannot find a corresponding process
>
> entry for that peer.
>
>
>
> This attempted connection will be ignored; your MPI job may or may not
>
> continue properly.
>
>
>
>   Local host: n001
>
>   PID:985481
>
>
>
>
>


Re: [OMPI users] Issue with Running MPI Job on CentOS 7

2023-05-31 Thread Gilles Gouaillardet via users
Open MPI 1.6.5 is an antique version and you should not expect any support
for it.
Instead, I suggest you try the latest one, rebuild your app and try again.

FWIW, that kind of error occurs when the MPI library does not match mpirun.
That can happen when mpirun and libmpi.so come from different vendors
and/or very different versions.
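
A quick way to check for such a mismatch (rough sketch; adjust the path to your
mpispeed binary):

which mpirun
mpirun --version
ldd ./mpispeed | grep libmpi

The mpirun you invoke and the libmpi.so the binary is linked against should come
from the same Open MPI installation.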


Cheers,

Gilles

On Thu, Jun 1, 2023 at 10:27 AM 深空探测 via users 
wrote:

> Hi all,
>
> I am writing to seek assistance regarding an issue I encountered while
> running an MPI job on a CentOS 7 virtual machine.
>
> To provide some context, I have successfully installed Open MPI version
> 1.6.5 on my CentOS 7 system. However, when I attempted to run the command
> "mpirun -n 2 -H wude,wude mpispeed 1000 10s 1", where "wude" is my
> hostname, I encountered unexpected results. Instead of running with two
> processes as intended, it appears that only one process was executed. The
> output I received is as follows:
>
> Processor = wude
> Rank = 0/1
> Sorry, must run with an even number of processes
> This program should be invoked in a manner similar to:
> mpirun -H host1,host2,...,hostN mpispeed [<numSends>|<timeSend>s]
> []
> where
> numSends: number of blocks to send (e.g., 256), or
> timeSend: duration in seconds to send (e.g., 100s)
> Processor = wude
> Rank = 0/1
> Sorry, must run with an even number of processes
> This program should be invoked in a manner similar to:
> mpirun -H host1,host2,...,hostN mpispeed [<numSends>|<timeSend>s]
> []
> where
> numSends: number of blocks to send (e.g., 256), or
> timeSend: duration in seconds to send (e.g., 100s)
> --
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --
>
> I am unsure about the source of this problem and would appreciate any
> guidance or insights you can provide to help me resolve it. It seems that
> there may be an issue with the process distribution or the command syntax.
>
> I would be grateful if you could review the information provided and
> suggest any possible solutions or troubleshooting steps that I can
> undertake to rectify the problem.
>
> Thank you for your attention to this matter. I look forward to hearing
> from you soon.
>
> Best regards,
>
> De Wu
>


Re: [OMPI users] psec warning when launching with srun

2023-05-20 Thread Gilles Gouaillardet via users
Christof,

Open MPI switching to the internal PMIx is a bug I addressed in
https://github.com/open-mpi/ompi/pull/11704

Feel free to manually download and apply the patch; you will then need
recent autotools and to run
./autogen.pl --force

Another option is to manually edit the configure file.

Look for the following snippet

    # Final check - if they didn't point us explicitly at an external version
    # but we found one anyway, use the internal version if it is higher
    if test "$opal_external_pmix_version" != "internal" && (test -z "$with_pmix" || test "$with_pmix" = "yes")
    then :
      if test "$opal_external_pmix_version" != "3x"

and replace the last line with

      if test $opal_external_pmix_version_major -lt 3


Cheers,

Gilles

On Sat, May 20, 2023 at 6:13 PM christof.koehler--- via users <
users@lists.open-mpi.org> wrote:

> Hello Z. Matthias Krawutschke,
>
> On Fri, May 19, 2023 at 09:08:08PM +0200, Zhéxué M. Krawutschke wrote:
> > Hello Christoph,
> > what exactly is your problem with OpenMPI and Slurm?
> > Do you compile the products yourself? Which LINUX distribution and
> version are you using?
> >
> > If you compile the software yourself, could you please tell me what the
> "configure" command looks like and which MUNGE version is in use? From the
> distribution or compiled by yourself?
> >
> > I would be very happy to take on this topic and help you. You can also
> reach me at +49 176 67270992.
> > Best regards from Berlin
>
> please refer to (especially the end) of my first mail in this thread
> which is available here
> https://www.mail-archive.com/users@lists.open-mpi.org/msg35141.html
>
> I believe this contains the relevant information you are requesting. The
> second mail which you are replying to was just additional information.
> My apologies if this led to confusion.
>
> Please let me know if any relevant information is missing from my first
> email. At the bottom of this email I include the ompi_info output as
> further addendum.
>
> To summarize: I would like to understand where the munge warning
> and PMIx error described in the first email (and the github link
> included) come from. The explanation in the github issue
> does not appear to be correct as all munge libraries are
> available everywhere. To me, it appears at the moment that OpenMPI's
> configure decides erroneously to build and use the internal pmix
> instead of using the (presumably) newer externally available PMIx,
> leading to launcher problems with srun.
>
>
> Best Regards
>
> Christof
>
>  Package: Open MPI root@admin.service Distribution
> Open MPI: 4.1.5
>   Open MPI repo revision: v4.1.5
>Open MPI release date: Feb 23, 2023
> Open RTE: 4.1.5
>   Open RTE repo revision: v4.1.5
>Open RTE release date: Feb 23, 2023
> OPAL: 4.1.5
>   OPAL repo revision: v4.1.5
>OPAL release date: Feb 23, 2023
>  MPI API: 3.1.0
> Ident string: 4.1.5
>   Prefix: /cluster/mpi/openmpi/4.1.5/gcc-11.3.1
>  Configured architecture: x86_64-pc-linux-gnu
>   Configure host: admin.service
>Configured by: root
>Configured on: Wed May 17 18:45:42 UTC 2023
>   Configure host: admin.service
>   Configure command line: '--enable-mpi1-compatibility'
> '--enable-orterun-prefix-by-default'
> '--with-ofi=/cluster/libraries/libfabric/1.18.0/' '--with-slurm'
> '--with-pmix' '--with-pmix-libdir=/usr/lib64' '--with-pmi'
> '--with-pmi-libdir=/usr/lib64'
> '--prefix=/cluster/mpi/openmpi/4.1.5/gcc-11.3.1'
> Built by: root
> Built on: Wed May 17 06:48:36 PM UTC 2023
>   Built host: admin.service
>   C bindings: yes
> C++ bindings: no
>  Fort mpif.h: yes (all)
> Fort use mpi: yes (full: ignore TKR)
>Fort use mpi size: deprecated-ompi-info-value
> Fort use mpi_f08: yes
>  Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
> limitations in the gfortran compiler and/or Open MPI, does not support
> the following: array subsections, direct passthru (where possible) to
> underlying Open MPI's C functionality
>   Fort mpi_f08 subarrays: no
>Java bindings: no
>   Wrapper compiler rpath: runpath
>   C compiler: gcc
>  C compiler absolute: /usr/bin/gcc
>   C compiler family name: GNU
>   C compiler version: 11.3.1
> C++ compiler: g++
>C++ compiler absolute: /usr/bin/g++
>Fort compiler: gfortran
>Fort compiler abs: /usr/bin/gfortran
>  Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
>Fort 08 assumed shape: yes
>   Fort optional args: yes
>   Fort INTERFACE: yes
> Fort ISO_FORTRAN_ENV: yes
>Fort STORAGE_SIZE: yes
>   Fort BIND(C) (all): yes
>   Fort ISO_C_BINDING: yes
>  Fort SUBROUTINE BIND(C): yes
>Fort TYPE,BIND(C): yes
>  

Re: [OMPI users] Issue with unexpected IP address in OpenMPI

2023-03-27 Thread Gilles Gouaillardet via users

Todd,


Similar issues were also reported when there is Network Address Translation
(NAT) between hosts, and that occurred when using kvm/qemu virtual
machines running on the same host.



First you need to list the available interfaces on both nodes. Then try 
to restrict to a single interface that is known to be working


(no firewall and no NAT)

(e.g. mpirun --mca btl_tcp_if_include eth0 --mca oob_tcp_if_include eth0 
...)
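
To list the interfaces mentioned above, something like this on each node is enough
(a sketch; either command should do):

ip -o -4 addr show
# or
ifconfig -a

Then pick an interface name that exists on both nodes and is reachable without NAT
for the btl_tcp_if_include / oob_tcp_if_include values.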



If that does not help make sure there is no NAT:

on the first node, run

nc -v -l 1234

then on the other node, run

nc <ip of the first node> 1234


If you go back to the first node, you should see the expected ip of the 
second node.


If not, there is NAT somewhere and that does not fly well with Open MPI


Cheers,


Gilles


On 3/28/2023 8:53 AM, Todd Spencer via users wrote:


OpenMPI Users,

I hope this email finds you all well. I am writing to bring to your 
attention an issue that I have encountered while using OpenMPI.


I received the following error message while running a job:

"Open MPI detected an inbound MPI TCP connection request from a peer 
that appears to be part of this MPI job (i.e., it identified itself as 
part of this Open MPI job), but it is from an IP address that is 
unexpected. This is highly unusual. The inbound connection has been 
dropped, and the peer should simply try again with a different IP 
interface (i.e., the job should hopefully be able to continue).


Local host: node02
Local PID: 17805
Peer hostname: node01 ([[23078,1],2])
Source IP of socket: 192.168.0.3
Known IPs of peer: 192.168.0.225"


I have tried to troubleshoot the issue but to no avail. As a new user 
to this subject, I am not sure what could be causing this issue. I did 
try forcing the nodes to talk to each other using eth0 using the "-mca 
btl_tcp_if_include eth0" command but it did not work.


I found a GitHub thread  
from 2018 that discussed the issue, but since I am new to this, a lot 
of the subject matter went over my head. Could you please advise on 
what could be causing this issue and how to resolve it? If you need 
any additional information, I would be happy to provide it.


Thank you in advance for your help.

Best regards,

Todd





Re: [OMPI users] What is the best choice of pml and btl for intranode communication

2023-03-05 Thread Gilles Gouaillardet via users
Arun,

First Open MPI selects a pml for **all** the MPI tasks (for example,
pml/ucx or pml/ob1)

Then, if pml/ob1 ends up being selected, a btl component (e.g. btl/uct,
btl/vader) is used for each pair of MPI tasks
(tasks on the same node will use btl/vader, tasks on different nodes will
use btl/uct)

Note that if UCX is available, pml/ucx takes the highest priority, so no
btl is involved
(in your case, if means intra-node communications will be handled by UCX
and not btl/vader).
You can force ob1 and try different combinations of btl with
mpirun --mca pml ob1 --mca btl self,<btl list> ...

I expect pml/ucx is faster than pml/ob1 with btl/uct for inter node
communications.

I have not benchmarked Open MPI for a while and it is possible btl/vader
outperforms pml/ucx for intra nodes communications,
so if you run on a small number of Infiniband interconnected nodes with a
large number of tasks per node, you might be able
to get the best performances by forcing pml/ob1.

Bottom line, I think it is best for you to benchmark your application and
pick the combination that leads to the best performances,
and you are more than welcome to share your conclusions.
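
For example (a sketch assuming the OSU micro-benchmarks, or any similar
point-to-point benchmark, are available), on a single node you could compare:

mpirun -np 2 ./osu_latency
mpirun -np 2 --mca pml ucx ./osu_latency
mpirun -np 2 --mca pml ob1 --mca btl self,vader ./osu_latency

The first run shows what Open MPI picks by default, and the other two force UCX and
ob1+btl/vader respectively, so you can directly compare intra-node latency (and do
the same with osu_bw and a few collectives).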

Cheers,

Gilles


On Mon, Mar 6, 2023 at 3:12 PM Chandran, Arun via users <
users@lists.open-mpi.org> wrote:

> [Public]
>
> Hi Folks,
>
> I can run benchmarks and find the pml+btl (ob1, ucx, uct, vader, etc)
> combination that gives the best performance,
> but I wanted to hear from the community about what is generally used in
> "__high_core_count_intra_node_" cases before jumping into conclusions.
>
> As I am a newcomer to openMPI I don't want to end up using a combination
> only because it fared better in a benchmark (overfitting?)
>
> Or the choice of pml+btl for the 'intranode' case is not so important as
> openmpi is mainly used in 'internode' and the 'networking-equipment'
> decides the pml+btl? (UCX for IB)
>
> --Arun
>
> -Original Message-
> From: users  On Behalf Of Chandran,
> Arun via users
> Sent: Thursday, March 2, 2023 4:01 PM
> To: users@lists.open-mpi.org
> Cc: Chandran, Arun 
> Subject: [OMPI users] What is the best choice of pml and btl for intranode
> communication
>
> Hi Folks,
>
> As the number of cores in a socket is keep on increasing, the right
> pml,btl (ucx, ob1, uct, vader, etc) that gives the best performance in
> "intra-node" scenario is important.
>
> For openmpi-4.1.4, which pml, btl combination is the best for intra-node
> communication in the case of higher core count scenario? (p-to-p as well as
> coll) and why?
> Does the answer for the above question holds good for the upcoming ompi5
> release?
>
> --Arun
>


Re: [OMPI users] Open MPI 4.0.3 outside as well as inside a SimpleFOAM container: step creation temporarily disabled, retrying Requested nodes are busy

2023-02-28 Thread Gilles Gouaillardet via users

Rob,


Do you invoke mpirun from **inside** the container?

IIRC, mpirun is generally invoked from **outside** the container, could 
you try this if not already the case?



The error message is from SLURM, so this is really a SLURM vs 
singularity issue.


What if you

srun -N 2 -n 2 hostname

instead of

mpirun ...


Cheers,


Gilles

On 3/1/2023 12:44 PM, Rob Kudyba via users wrote:
Singularity 3.5.3 on a RHEL 7 cluster w/ OpenMPI 4.0.3 lives inside a 
SimpleFOAM version 10 container. I've confirmed the OpenMPI versions 
are the same. Perhaps this is a question for Singularity users as well, 
but how can I troubleshoot why mpirun just returns "step creation 
temporarily disabled, retrying (Requested nodes are busy)"?


Singularity> mpirun -V
mpirun (Open MPI) 4.0.3
Report bugs to http://www.open-mpi.org/community/help/
Singularity> which mpirun
/usr/bin/mpirun
Singularity>

$ mpirun -V
mpirun (Open MPI) 4.0.3

mpirun -n 2 -mca plm_base_verbose 100 --mca ras_base_verbose 100 --mca 
rss_base_verbose 100 --mca rmaps_base_verbose 100  singularity exec 
openfoam simpleFoam -fileHandler uncollated -parallel | tee log.simpleFoam
openfoam10/          openfoam10.sif openfoamtestfile.sh 
 openfoam_v2012.sif
[myuser@node047 motorBike]$ mpirun -n 2 -mca plm_base_verbose 100 
--mca ras_base_verbose 100 --mca rss_base_verbose 100 --mca 
rmaps_base_verbose 100  singularity exec openfoam   simpleFoam 
-fileHandler uncollated -parallel | tee log.simpleFoam
openfoam10/          openfoam10.sif openfoamtestfile.sh 
 openfoam_v2012.sif
[myuser@node047 motorBike]$ mpirun -n 2 -mca plm_base_verbose 100 
--mca ras_base_verbose 100 --mca rss_base_verbose 100 --mca 
rmaps_base_verbose 100  singularity exec openfoam10.sif   simpleFoam 
 -parallel | tee log.simpleFoam
[node047:11650] mca: base: components_register: registering framework 
plm components
[node047:11650] mca: base: components_register: found loaded component 
slurm
[node047:11650] mca: base: components_register: component slurm 
register function successful
[node047:11650] mca: base: components_register: found loaded component 
isolated
[node047:11650] mca: base: components_register: component isolated has 
no register or open function

[node047:11650] mca: base: components_register: found loaded component rsh
[node047:11650] mca: base: components_register: component rsh register 
function successful

[node047:11650] mca: base: components_open: opening plm components
[node047:11650] mca: base: components_open: found loaded component slurm
[node047:11650] mca: base: components_open: component slurm open 
function successful
[node047:11650] mca: base: components_open: found loaded component 
isolated
[node047:11650] mca: base: components_open: component isolated open 
function successful

[node047:11650] mca: base: components_open: found loaded component rsh
[node047:11650] mca: base: components_open: component rsh open 
function successful

[node047:11650] mca:base:select: Auto-selecting plm components
[node047:11650] mca:base:select:(  plm) Querying component [slurm]
[node047:11650] mca:base:select:(  plm) Query of component [slurm] set 
priority to 75

[node047:11650] mca:base:select:(  plm) Querying component [isolated]
[node047:11650] mca:base:select:(  plm) Query of component [isolated] 
set priority to 0

[node047:11650] mca:base:select:(  plm) Querying component [rsh]
[node047:11650] mca:base:select:(  plm) Query of component [rsh] set 
priority to 10

[node047:11650] mca:base:select:(  plm) Selected component [slurm]
[node047:11650] mca: base: close: component isolated closed
[node047:11650] mca: base: close: unloading component isolated
[node047:11650] mca: base: close: component rsh closed
[node047:11650] mca: base: close: unloading component rsh
[node047:11650] mca: base: components_register: registering framework 
ras components
[node047:11650] mca: base: components_register: found loaded component 
slurm
[node047:11650] mca: base: components_register: component slurm 
register function successful
[node047:11650] mca: base: components_register: found loaded component 
simulator
[node047:11650] mca: base: components_register: component simulator 
register function successful

[node047:11650] mca: base: components_open: opening ras components
[node047:11650] mca: base: components_open: found loaded component slurm
[node047:11650] mca: base: components_open: component slurm open 
function successful
[node047:11650] mca: base: components_open: found loaded component 
simulator

[node047:11650] mca:base:select: Auto-selecting ras components
[node047:11650] mca:base:select:(  ras) Querying component [slurm]
[node047:11650] mca:base:select:(  ras) Query of component [slurm] set 
priority to 50

[node047:11650] mca:base:select:(  ras) Querying component [simulator]
[node047:11650] mca:base:select:(  ras) Selected component [slurm]
[node047:11650] mca: base: close: unloading component simulator
[node047:11650] mca: base: components_register: registering framework 
rmaps components


Re: [OMPI users] ucx configuration

2023-01-11 Thread Gilles Gouaillardet via users
You can pick one test, make it standalone, and open an issue on GitHub.

How does (vanilla) Open MPI compare to your vendor Open MPI based library?

Cheers,

Gilles

On Wed, Jan 11, 2023 at 10:20 PM Dave Love via users <
users@lists.open-mpi.org> wrote:

> Gilles Gouaillardet via users  writes:
>
> > Dave,
> >
> > If there is a bug you would like to report, please open an issue at
> > https://github.com/open-mpi/ompi/issues and provide all the required
> > information
> > (in this case, it should also include the UCX library you are using and
> how
> > it was obtained or built).
>
> There are hundreds of failures I was interested in resolving with the
> latest versions, though I think somewhat fewer than with previous UCX
> versions.
>
> I'd like to know how it's recommended I should build to ensure I'm
> starting from the right place for any investigation.  Possible interplay
> between OMPI and UCX options seems worth understanding specifically, and
> I think it's reasonable to ask how to configure things to work together
> generally, when there are so many options without much explanation.
>
> I have tried raising issues previously without much luck but, given the
> number of failures, something is fundamentally wrong, and I doubt you
> want the output from the whole set.
>
> Perhaps the MPICH test set in a "portable" configuration is expected to
> fail with OMPI for some reason, and someone can comment on that.
> However, it's the only comprehensive set I know is available, and
> originally even IMB crashed, so I'm not inclined to blame the tests
> initially, and wonder how this stuff is tested.


Re: [OMPI users] ucx configuration

2023-01-07 Thread Gilles Gouaillardet via users
Dave,

If there is a bug you would like to report, please open an issue at
https://github.com/open-mpi/ompi/issues and provide all the required
information
(in this case, it should also include the UCX library you are using and how
it was obtained or built).


Cheers,

Gilles

On Fri, Jan 6, 2023 at 12:17 AM Dave Love via users <
users@lists.open-mpi.org> wrote:

> I see assorted problems with OMPI 4.1 on IB, including failing many of
> the mpich tests (non-mpich-specific ones) particularly with RMA.  Now I
> wonder if UCX build options could have anything to do with it, but I
> haven't found any relevant information.
>
> What configure options would be recommended with CUDA and ConnectX-5 IB?
> (This is on POWER, but I presume that's irrelevant.)  I assume they
> should be at least
>
> --enable-cma --enable-mt --with-cuda --with-gdrcopy --with-verbs
> --with-mlx5-dv
>
> but for a start I don't know what the relationship is between the cuda,
> shared memory, and multi-threading options in OMPI and UCX.
>
> Thanks for any enlightenment.


Re: [OMPI users] Question about "mca" parameters

2022-11-29 Thread Gilles Gouaillardet via users

Hi,


Simply add


btl = tcp,self


If the openib error message persists, try also adding

osc_rdma_btls = ugni,uct,ucp

or simply

osc = ^rdma



Cheers,


Gilles

On 11/29/2022 5:16 PM, Gestió Servidors via users wrote:


Hi,

If I run “mpirun --mca btl tcp,self --mca allow_ib 0 -n 12 
./my_program”, I manage to disable some “extra” info in the output file like:


The OpenFabrics (openib) BTL failed to initialize while trying to

allocate some locked memory.  This typically can indicate that the

memlock limits are set too low.  For most HPC installations, the

memlock limits should be set to "unlimited".  The failure occured

here:

Local host:    clus11

OMPI source:   btl_openib.c:757

Function:  opal_free_list_init()

Device:    qib0

Memlock limit: 65536

You may need to consult with your system administrator to get this

problem fixed.  This FAQ entry on the Open MPI web site may also be

helpful:

http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages

--

[clus11][[33029,1],0][btl_openib.c:1062:mca_btl_openib_add_procs] 
could not prepare openib device for use


[clus11][[33029,1],1][btl_openib.c:1062:mca_btl_openib_add_procs] 
could not prepare openib device for use


[clus11][[33029,1],9][btl_openib.c:1062:mca_btl_openib_add_procs] 
could not prepare openib device for use


[clus11][[33029,1],8][btl_openib.c:1062:mca_btl_openib_add_procs] 
could not prepare openib device for use


[clus11][[33029,1],2][btl_openib.c:1062:mca_btl_openib_add_procs] 
could not prepare openib device for use


[clus11][[33029,1],6][btl_openib.c:1062:mca_btl_openib_add_procs] 
could not prepare openib device for use


[clus11][[33029,1],10][btl_openib.c:1062:mca_btl_openib_add_procs] 
could not prepare openib device for use


[clus11][[33029,1],11][btl_openib.c:1062:mca_btl_openib_add_procs] 
could not prepare openib device for use


[clus11][[33029,1],5][btl_openib.c:1062:mca_btl_openib_add_procs] 
could not prepare openib device for use


[clus11][[33029,1],3][btl_openib.c:1062:mca_btl_openib_add_procs] 
could not prepare openib device for use


[clus11][[33029,1],4][btl_openib.c:1062:mca_btl_openib_add_procs] 
could not prepare openib device for use


[clus11][[33029,1],7][btl_openib.c:1062:mca_btl_openib_add_procs] 
could not prepare openib device for use


or like

By default, for Open MPI 4.0 and later, infiniband ports on a device

are not used by default.  The intent is to use UCX for these devices.

You can override this policy by setting the btl_openib_allow_ib MCA 
parameter


to true.

Local host:  clus11

Local adapter:   qib0

Local port:  1

--

--

WARNING: There was an error initializing an OpenFabrics device.

Local host:   clus11

Local device: qib0

--

so, now, I would like to force those parameters in the file 
$OMPI/etc/openmpi-mca-params.conf. I have run “ompi_info --param all 
all --level 9” to get all parameters, but I don’t know exactly what 
parameters I need to add to $OMPI/etc/openmpi-mca-params.conf and what 
is the correct syntax to force “--mca btl tcp,self 
--mca allow_ib 0” always. I have already added “btl_openib_allow_ib = “ and 
it works, but for the parameter “--mca btl tcp,self”, what would be the 
correct syntax in the $OMPI/etc/openmpi-mca-params.conf file?


Thanks!!





Re: [OMPI users] CephFS and striping_factor

2022-11-28 Thread Gilles Gouaillardet via users

Hi Eric,


Currently, Open MPI does not provide specific support for CephFS.

MPI-IO is either implemented by ROMIO (imported from MPICH, it does not 
support CephFS today)


or the "native" ompio component (that also does not support CephFS today).


A proof of concept for CephFS in ompio might not be a huge work for 
someone motivated:


That could be as simple as (so to speak, since things are generally not 
easy) creating a new fs/ceph component


(e.g. in ompi/mca/fs/ceph) and implement the "file_open" callback that 
uses the ceph API.


I think the fs/lustre component can be used as an inspiration.


I cannot commit to do this, but if you are willing to take a crack at 
it, I can create such a component


so you can go directly to implementing the callback without spending too 
much time on some Open MPI internals


(e.g. component creation).



Cheers,


Gilles


On 11/29/2022 6:55 AM, Eric Chamberland via users wrote:

Hi,

I would like to know if OpenMPI is supporting file creation with 
"striping_factor" for CephFS?


According to CephFS library, I *think* it would be possible to do it 
at file creation with "ceph_open_layout".


https://github.com/ceph/ceph/blob/main/src/include/cephfs/libcephfs.h

Is it a possible futur enhancement?

Thanks,

Eric





Re: [OMPI users] Run on dual-socket system

2022-11-26 Thread Gilles Gouaillardet via users
Arham,

It should be balanced: the default mapping is to allocate NUMA packages
round robin.

you can
mpirun --report-bindings -n 28 true
to have Open MPI report the bindings

or

mpirun --tag-output -n 28 grep Cpus_allowed_list /proc/self/status

to have each task report which physical cpus it is bound to.


Cheers,

Gilles

On Sat, Nov 26, 2022 at 5:38 PM Arham Amouei via users <
users@lists.open-mpi.org> wrote:

> Hi
>
> If I run a code with
>
> mpirun -n 28 ./code
>
> Is it guaranteed that Open MPI and/or OS give equal number of processes to
> each socket? Or I have to use some mpirun options?
>
> Running the code with the command given above, one socket gets much hotter
> than the other (60°C vs 80°C). I'm sure that the code itself divides the
> job equally among the processes.
>
> The system is Dell Precision 7910. Two Xeon E5-2680 v4 and two 16GB 2400
> RAM modules are installed. There are a total number of 28 physical cores.
> The total number of logical cores is 56. The OS is Ubuntu 22.04.
>
> Thanks in advance
> Arham
>
>
>
>


Re: [OMPI users] users Digest, Vol 4818, Issue 1

2022-11-14 Thread Gilles Gouaillardet via users
    Flags: DAEMON_LAUNCHED:LOCATION_VERIFIED:SLOTS_GIVEN
    aliases: 192.168.180.48
    hepslustretest03: slots=1 max_slots=0 slots_inuse=0 state=UP
    Flags: DAEMON_LAUNCHED:LOCATION_VERIFIED:SLOTS_GIVEN
    aliases: 
192.168.60.203,hepslustretest03.ihep.ac.cn,172.17.180.203,172.168.10.23,172.168.10.143

=
[computer01:39342] mca:rmaps: mapping job prterun-computer01-39342@1
[computer01:39342] mca:rmaps: setting mapping policies for job 
prterun-computer01-39342@1 inherit TRUE hwtcpus FALSE

[computer01:39342] mca:rmaps[358] mapping not given - using bycore
[computer01:39342] setdefaultbinding[365] binding not given - using bycore
[computer01:39342] mca:rmaps:ppr: job prterun-computer01-39342@1 not 
using ppr mapper PPR NULL policy PPR NOTSET
[computer01:39342] mca:rmaps:seq: job prterun-computer01-39342@1 not 
using seq mapper

[computer01:39342] mca:rmaps:rr: mapping job prterun-computer01-39342@1
[computer01:39342] AVAILABLE NODES FOR MAPPING:
[computer01:39342] node: computer01 daemon: 0 slots_available: 1
[computer01:39342] mca:rmaps:rr: mapping by Core for job 
prterun-computer01-39342@1 slots 1 num_procs 2

--
There are not enough slots available in the system to satisfy the 2
slots that were requested by the application:

  which

Either request fewer procs for your application, or make more slots
available for use.

A "slot" is the PRRTE term for an allocatable unit where we can
launch a process.  The number of slots available are defined by the
environment in which PRRTE processes are run:

  1. Hostfile, via "slots=N" clauses (N defaults to number of
 processor cores if not provided)
  2. The --host command line parameter, via a ":N" suffix on the
 hostname (N defaults to 1 if not provided)
  3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
  4. If none of a hostfile, the --host command line parameter, or an
 RM is present, PRRTE defaults to the number of processor cores

In all the above cases, if you want PRRTE to default to the number
of hardware threads instead of the number of processor cores, use the
--use-hwthread-cpus option.

Alternatively, you can use the --map-by :OVERSUBSCRIBE option to 
ignore the

number of available slots when deciding the number of processes to
launch.
--

On 2022/11/15 02:04, users-requ...@lists.open-mpi.org wrote:


Today's Topics:

1. Re: [OMPI devel] There are not enough slots available in the
   system to satisfy the 2, slots that were requested by the
   application (Jeff Squyres (jsquyres))
2. Re: Tracing of openmpi internal functions
   (Jeff Squyres (jsquyres))


--

Message: 1
Date: Mon, 14 Nov 2022 17:04:24 +
From: "Jeff Squyres (jsquyres)"
To: Open MPI Users
Subject: Re: [OMPI users] [OMPI devel] There are not enough slots
available in the system to satisfy the 2, slots that were requested by
the application

Yes, somehow I'm not seeing all the output that I expect to see.  Can you ensure that if you're copy-and-pasting from 
the email, that it's actually using "dash dash" in front of "mca" and "machinefile" (vs. 
a copy-and-pasted "em dash")?

--
Jeff Squyres
jsquy...@cisco.com

From: users  on behalf of Gilles Gouaillardet via 
users
Sent: Sunday, November 13, 2022 9:18 PM
To: Open MPI Users
Cc: Gilles Gouaillardet
Subject: Re: [OMPI users] [OMPI devel] There are not enough slots available in 
the system to satisfy the 2, slots that were requested by the application

There is a typo in your command line.
You should use --mca (minus minus) instead of -mca

Also, you can try --machinefile instead of -machinefile

Cheers,

Gilles

There are not enough slots available in the system to satisfy the 2
slots that were requested by the application:

   –mca

On Mon, Nov 14, 2022 at 11:04 AM timesir via users
<users@lists.open-mpi.org> wrote:

(py3.9) ➜  /share  mpirun -n 2 -machinefile hosts –mca rmaps_base_verbose 100 
--mca ras_base_verbose 100  which mpirun
[computer01:04570] mca: base: component_find: searching NU

Re: [OMPI users] [OMPI devel] There are not enough slots available in the system to satisfy the 2, slots that were requested by the application

2022-11-13 Thread Gilles Gouaillardet via users
There is a typo in your command line.
You should use --mca (minus minus) instead of -mca

Also, you can try --machinefile instead of -machinefile
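
With those two changes, the command from your message below would look like
(note the two regular dashes in front of each option):

mpirun -n 2 --machinefile hosts --mca rmaps_base_verbose 100 --mca ras_base_verbose 100 which mpirun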

Cheers,

Gilles

There are not enough slots available in the system to satisfy the 2
slots that were requested by the application:

  –mca

On Mon, Nov 14, 2022 at 11:04 AM timesir via users 
wrote:

> *(py3.9) ➜  /share  mpirun -n 2 -machinefile hosts –mca rmaps_base_verbose
> 100 --mca ras_base_verbose 100  which mpirun*
> [computer01:04570] mca: base: component_find: searching NULL for ras
> components
> [computer01:04570] mca: base: find_dyn_components: checking NULL for ras
> components
> [computer01:04570] pmix:mca: base: components_register: registering
> framework ras components
> [computer01:04570] pmix:mca: base: components_register: found loaded
> component simulator
> [computer01:04570] pmix:mca: base: components_register: component
> simulator register function successful
> [computer01:04570] pmix:mca: base: components_register: found loaded
> component pbs
> [computer01:04570] pmix:mca: base: components_register: component pbs
> register function successful
> [computer01:04570] pmix:mca: base: components_register: found loaded
> component slurm
> [computer01:04570] pmix:mca: base: components_register: component slurm
> register function successful
> [computer01:04570] mca: base: components_open: opening ras components
> [computer01:04570] mca: base: components_open: found loaded component
> simulator
> [computer01:04570] mca: base: components_open: found loaded component pbs
> [computer01:04570] mca: base: components_open: component pbs open function
> successful
> [computer01:04570] mca: base: components_open: found loaded component slurm
> [computer01:04570] mca: base: components_open: component slurm open
> function successful
> [computer01:04570] mca:base:select: Auto-selecting ras components
> [computer01:04570] mca:base:select:(  ras) Querying component [simulator]
> [computer01:04570] mca:base:select:(  ras) Querying component [pbs]
> [computer01:04570] mca:base:select:(  ras) Querying component [slurm]
> [computer01:04570] mca:base:select:(  ras) No component selected!
>
> ==   ALLOCATED NODES
> ==
> [10/1444]
> computer01: slots=1 max_slots=0 slots_inuse=0 state=UP
> Flags: DAEMON_LAUNCHED:LOCATION_VERIFIED:SLOTS_GIVEN
> aliases: 192.168.180.48
> 192.168.60.203: slots=1 max_slots=0 slots_inuse=0 state=UNKNOWN
> Flags: SLOTS_GIVEN
> aliases: NONE
> =
>
> ==   ALLOCATED NODES   ==
> computer01: slots=1 max_slots=0 slots_inuse=0 state=UP
> Flags: DAEMON_LAUNCHED:LOCATION_VERIFIED:SLOTS_GIVEN
> aliases: 192.168.180.48
> hepslustretest03: slots=1 max_slots=0 slots_inuse=0 state=UP
> Flags: DAEMON_LAUNCHED:LOCATION_VERIFIED:SLOTS_GIVEN
> aliases: 192.168.60.203,172.17.180.203,172.168.10.23,172.168.10.143
> =
> --
> There are not enough slots available in the system to satisfy the 2
> slots that were requested by the application:
>
>   –mca
>
> Either request fewer procs for your application, or make more slots
> available for use.
>
> A "slot" is the PRRTE term for an allocatable unit where we can
> launch a process.  The number of slots available are defined by the
> environment in which PRRTE processes are run:
>
>   1. Hostfile, via "slots=N" clauses (N defaults to number of
>  processor cores if not provided)
>   2. The --host command line parameter, via a ":N" suffix on the
>  hostname (N defaults to 1 if not provided)
>   3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
>   4. If none of a hostfile, the --host command line parameter, or an
>  RM is present, PRRTE defaults to the number of processor cores
>
> In all the above cases, if you want PRRTE to default to the number
> of hardware threads instead of the number of processor cores, use the
> --use-hwthread-cpus option.
>
> Alternatively, you can use the --map-by :OVERSUBSCRIBE option to ignore the
> number of available slots when deciding the number of processes to
> launch.
> --
>
>
>
> 在 2022/11/13 23:42, Jeff Squyres (jsquyres) 写道:
>
> Interesting.  It says:
>
> [computer01:106117] AVAILABLE NODES FOR MAPPING:
> [computer01:106117] node: computer01 daemon: 0 slots_available: 1
>
> This is why it tells you you're out of slots: you're asking for 2, but it
> only found 1.  This means it's not seeing your hostfile somehow.
>
> I should have asked you to run with *2*​ variables last time -- can you
> re-run with "mpirun --mca rmaps_base_verbose 100 --mca ras_base_verbose 100
> ..."?
>
> Turning on the RAS verbosity should show us what the 

Re: [OMPI users] lots of undefined symbols compiling a hello-world

2022-11-05 Thread Gilles Gouaillardet via users
Chris,

Did you double check libopen-rte.so.40 and libopen-pal.so.40 are installed
in /mnt/software/o/openmpi/4.1.4-ct-test/lib?

If they are not present, it means your install is busted and you should try
to reinstall it.


Cheers,

Gilles

On Sat, Nov 5, 2022 at 3:42 AM Chris Taylor via users <
users@lists.open-mpi.org> wrote:

> I built 4.1.4 from a tarball using gcc 11.1.0 with just --prefix as an
> option for configure. I get a long list of errors trying to compile a
> hello-world. I know it's something simple but I'm at a loss as to how to
> troubleshoot-
>
>
>
> $ mpicc -o hello-world hello-world.c
>
> /mnt/software/g/gcc/gcctoolchain/gcctoolchain-release_10.2.0.125178/gcc/gcc_11.1.0.pbi01/build-libc_2.17/target-libc_2.17/binnowrap/x86_64-libc_2.17-linux-gnu-ld.bfd:
> warning: libopen-rte.so.40, needed by
> /mnt/software/o/openmpi/4.1.4-ct-test/lib/libmpi.so, not found (try using
> -rpath or -rpath-link)
>
> /mnt/software/g/gcc/gcctoolchain/gcctoolchain-release_10.2.0.125178/gcc/gcc_11.1.0.pbi01/build-libc_2.17/target-libc_2.17/binnowrap/x86_64-libc_2.17-linux-gnu-ld.bfd:
> warning: libopen-pal.so.40, needed by
> /mnt/software/o/openmpi/4.1.4-ct-test/lib/libmpi.so, not found (try using
> -rpath or -rpath-link)
>
> /mnt/software/g/gcc/gcctoolchain/gcctoolchain-release_10.2.0.125178/gcc/gcc_11.1.0.pbi01/build-libc_2.17/target-libc_2.17/binnowrap/x86_64-libc_2.17-linux-gnu-ld.bfd:
> /mnt/software/o/openmpi/4.1.4-ct-test/lib/libmpi.so: undefined reference to
> `mca_base_framework_components_close'
>
> /mnt/software/g/gcc/gcctoolchain/gcctoolchain-release_10.2.0.125178/gcc/gcc_11.1.0.pbi01/build-libc_2.17/target-libc_2.17/binnowrap/x86_64-libc_2.17-linux-gnu-ld.bfd:
> /mnt/software/o/openmpi/4.1.4-ct-test/lib/libmpi.so: undefined reference to
> `opal_list_sort'
>
>
>
> ...
>
> ...
>
>
>
>
>
> $ mpicc --showme:link
>
> -pthread -L/mnt/software/o/openmpi/4.1.4-ct-test/lib -Wl,-rpath
> -Wl,/mnt/software/o/openmpi/4.1.4-ct-test/lib -Wl,--enable-new-dtags -lmpi
>
> $ mpicc --showme:compile
>
> -I/mnt/software/o/openmpi/4.1.4-ct-test/include -pthread
>
>
>


Re: [OMPI users] ifort and openmpi

2022-09-15 Thread Gilles Gouaillardet via users
Volker,

https://ntq1982.github.io/files/20200621.html (mentioned in the ticket)
suggests that patching the generated configure file can do the trick.

We already patch the generated configure file in autogen.pl (in the
patch_autotools_output subroutine), so I guess that could be enhanced
to support Intel Fortran on OSX.

I am confident a Pull Request that does fix this issue will be considered
for inclusion in future Open MPI releases.


Cheers,

Gilles

On Fri, Sep 16, 2022 at 11:20 AM Volker Blum via users <
users@lists.open-mpi.org> wrote:

> Hi all,
>
> This issue here:
>
> https://github.com/open-mpi/ompi/issues/7615
>
> is, unfortunately, still current.
>
> I understand that within OpenMPI there is a sense that this is Intel's
> problem but I’m not sure it is. Is it possible to address this in the
> configure script in the actual OpenMPI distribution in some form?
>
> There are more issues with OpenMPI + Intel + scalapack, but this is the
> first one that strikes. Eventually, the problem just renders a Macbook
> unusable as a computing tool since the only way it seems to run is with
> libraries from Homebrew (this works), but that appears to introduce
> unoptimized BLAS libraries - very slow. It’s the only working MPI setup
> that I could construct, though.
>
> I know that one can take the view that Intel Fortran on Mac is just broken
> for the default configure process, but it seems like a strange standoff to
> me. It would be much better to see this worked out in some way.
>
> Does anyone have a solution for this issue that could be merged into the
> actual configure script distributed with OpenMPI, rather than having to
> track down a fairly arcane addition(*) and apply it by hand?
>
> Sorry … I know this isn’t the best way of raising the issue but then, it
> is also tiring to spend hours on an already large build process and find
> that the issue is still there. If there was some way to figure this out so
> as to at least not affect OpenMPI, I suspect that would help a lot of
> users. Would anyone be willing to revisit the 2020 decision?
>
> Thank you!
>
> Best wishes
> Volker
>
> (*) I know about the patch in the README:
>
> - Users have reported (see
>   https://github.com/open-mpi/ompi/issues/7615) that the Intel Fortran
>   compiler will fail to link Fortran-based MPI applications on macOS
>   with linker errors similar to this:
>
>   Undefined symbols for architecture x86_64:
> "_ompi_buffer_detach_f08", referenced from:
> import-atom in libmpi_usempif08.dylib
>   ld: symbol(s) not found for architecture x86_64
>
>   It appears that setting the environment variable
>   lt_cv_ld_force_load=no before invoking Open MPI's configure script
>   works around the issue.  For example:
>
>   shell$ lt_cv_ld_force_load=no ./configure …
>
> This is nice but it does not help stop the issue from striking unless one
> reads a very long file in detail first. Isn’t this perhaps something that
> the configure script itself should be able to catch if it detects ifort?
>
>
>


Re: [OMPI users] Hardware topology influence

2022-09-13 Thread Gilles Gouaillardet via users
Lucas,

the number of MPI tasks started by mpirun is either
 - explicitly passed via the command line (e.g. mpirun -np 2306 ...)
 - equal to the number of available slots, and this value is either
 a) retrieved from the resource manager (such as a SLURM allocation)
 b) explicitly set in a machine file (e.g. mpirun -machinefile <machine file>
 ...) or on the command line
 (e.g. mpirun --host host0:96,host1:96 ...)
 c) if none of the above is set, the number of detected cores on the
system
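
To make these concrete (host names, slot counts and the ./app executable below
are illustrative only):

mpirun -np 4 ./app                       # explicit -np
mpirun -machinefile hosts.txt ./app      # case b): slots read from a machine file
mpirun --host host0:96,host1:96 ./app    # case b): 192 slots declared on the command line

where hosts.txt could contain lines such as

host0 slots=96
host1 slots=96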

Cheers,

Gilles

On Tue, Sep 13, 2022 at 9:23 PM Lucas Chaloyard via users <
users@lists.open-mpi.org> wrote:

> Hello,
>
> I'm working as a research intern in a lab where we're studying
> virtualization.
> And I've been working with several benchmarks using OpenMPI 4.1.0 (ASKAP,
> GPAW and Incompact3d from Phrononix Test suite).
>
> To briefly explain my experiments, I'm running those benchmarks on several
> virtual machines using different topologies.
> During one experiment I've been comparing those two topologies :
> - Topology1 : 96 vCPUS divided in 96 sockets containing 1 threads
> - Topology2 : 96 vCPUS divided in 48 sockets containing 2 threads (usage
> of hyperthreading)
>
> For the ASKAP Benchmark :
> - While using Topology2, 2306 processes will be created by the application
> to do its work.
> - While using Topology1, 4612 processes will be created by the application
> to do its work.
> This is also happening when running GPAW and Incompact3d benchmarks.
>
> What I've been wondering (and looking for) is, does OpenMPI take into
> account the topology, and reduce the number of processes create to execute
> its work in order to avoid the usage of hyperthreading ?
> Or is it something done by the application itself ?
>
> I was looking at the source code, and I've been trying to find how and
> when are filled the information about the MPI_COMM_WORLD communicator, to
> see if the 'num_procs' field depends on the topology, but I didn't have any
> chance for now.
>
> Respectfully, Chaloyard Lucas.
>


Re: [OMPI users] Using MPI in Toro unikernel

2022-07-24 Thread Gilles Gouaillardet via users
Matias,

Assuming you run one MPI task per unikernel, and two unikernels share
nothing, it means that communication between tasks on the same physical host
cannot be performed via shared memory or kernel features (such as xpmem or
knem). That also implies communications are likely going through the loopback
interface, which is much slower.

Cheers,

Gilles
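
As for benchmarks, one commonly used option is the OSU micro-benchmarks suite
(the command lines below are only a sketch, osu_latency and osu_allreduce are
two of its binaries):

mpirun -np 2 ./osu_latency
mpirun -np <number of cores> ./osu_allreduce

That would let you compare point-to-point latency and MPI_Allreduce times as
the number of cores increases.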

On Sun, Jul 24, 2022 at 8:11 PM Matias Ezequiel Vara Larsen via users <
users@lists.open-mpi.org> wrote:

> Hello everyone,
>
> I have started to play with MPI and unikernels and I have recently
> implemented a minimal set of MPI APIs on top of Toro Unikernel
> (
> https://github.com/torokernel/torokernel/blob/example-mpi/examples/MPI/MpiReduce.pas
> ).
> I was wondering if someone may be interested in the use of unikernels to
> deploy MPI applications. Toro is a shared-nothing unikernel in which
> each core runs independently from one another. Also, memory is per-core
> to leverage NUMA. I was thinking that those features may improve the
> execution of MPI applications but I have not measured that yet. For the
> moment, I am running a simple MPI reduction with the MPI_SUM operation
> and watching how this behaves when the number of cores increases. Do you
> know any benchmark that I can run so try to test that?
>
> Matias
>


Re: [OMPI users] Intercommunicator issue (any standard about communicator?)

2022-06-24 Thread Gilles Gouaillardet via users
Guillaume,

MPI_Comm is an opaque handle that should not be interpreted by an end user.

Open MPI chose to implement it as an opaque pointer, and MPICH chose to
implement it as a 32-bit unsigned integer.
The 4400 value strongly suggests you are using MPICH and you are hence
posting to the wrong mailing list


Cheers,

Gilles
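
As a small illustration of that point, a minimal C sketch (not taken from the
code discussed in this thread):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Comm interclient = MPI_COMM_NULL;

    MPI_Init(&argc, &argv);

    /* ... MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, world, &interclient); ... */

    /* correct: compare the handle against the predefined constants */
    if (interclient == MPI_COMM_NULL) {
        printf("no intercommunicator yet\n");
    }

    /* incorrect: printing or sign-testing the raw handle value is meaningless,
     * since MPI_Comm is opaque (a pointer in Open MPI, an integer in MPICH) */

    MPI_Finalize();
    return 0;
}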

On Fri, Jun 24, 2022 at 9:06 PM Guillaume De Nayer via users <
users@lists.open-mpi.org> wrote:

> Hi Gilles,
>
> MPI_COMM_WORLD is positive (4400).
>
> In a short code I wrote I have something like that:
>
> MPI_Comm_dup(MPI_COMM_WORLD, );
> cout << "intra-communicator: " << "world" << "---" << hex << world << endl;
>
> It returns "8406" (in hex).
>
> later I have:
>
> MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, world, );
> cout << "intercommunicator interClient=" << interClient << endl;
>
> After connection from a third party client it returns "c403" (in hex).
>
> Both 8406 and c403 are negative integer in dec.
>
> I don't know if it is "normal". Therefore I'm looking about rules on the
> communicators, intercommunicators.
>
> Regards,
> Guillaume
>
>
> On 06/24/2022 11:56 AM, Gilles Gouaillardet via users wrote:
> > Guillaume,
> >
> > what do you mean by (the intercommunicators are all negative"?
> >
> >
> > Cheers,
> >
> > Gilles
> >
> > On Fri, Jun 24, 2022 at 4:23 PM Guillaume De Nayer via users
> > mailto:users@lists.open-mpi.org>> wrote:
> >
> > Hi,
> >
> > I am new on this list. Let me introduce myself shortly: I am a
> > researcher in fluid mechanics. In this context I am using softwares
> > related on MPI.
> >
> > I am facing a problem:
> > - 3 programs forms a computational framework. Soft1 is a coupling
> > program, i.e., it opens an MPI port at the beginning. Soft2 and Soft3
> > are clients, which connect to the coupling program using
> > MPI_Comm_connect.
> > - After the start and the connections of Soft2 and Soft3 with Soft1,
> it
> > hangs.
> >
> > I started to debug this issue and as usual I found another issue (or
> > perhaps it is not an issue):
> > - The intercommunicators I get between Soft1-Soft2 and Soft1-Soft3
> are
> > all negative (running on CentOS 7 with infiniband Mellanox OFED
> driver).
> > - Is there some standard about communicator? I don't find anything
> > about
> > this topic.
> > - What is a valid communicator, intercommunicator?
> >
> > thx a lot
> > Regards
> > Guillaume
> >
>
>
>


Re: [OMPI users] OpenMPI and names of the nodes in a cluster

2022-06-24 Thread Gilles Gouaillardet via users
Sorry if I did not make my intent clear.

I was basically suggesting to hack the Open MPI and PMIx wrappers to 
hostname() and remove the problematic underscores to make the regx 
components a happy panda again.

Cheers,

Gilles
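
A minimal sketch of the underscore-stripping part of such a hack (illustrative
only; where exactly to apply it is discussed below):

#include <stdio.h>

/* remove '_' characters from a hostname in place, e.g. "compute_node_01" -> "computenode01";
 * this is the kind of post-processing a hacked hostname wrapper could apply
 * before the name reaches the regx component */
static void strip_underscores(char *hostname)
{
    char *src = hostname, *dst = hostname;
    while (*src != '\0') {
        if (*src != '_') {
            *dst++ = *src;
        }
        src++;
    }
    *dst = '\0';
}

int main(void)
{
    char name[] = "compute_node_01";
    strip_underscores(name);
    printf("%s\n", name);   /* prints "computenode01" */
    return 0;
}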

- Original Message -
> I think the files suggested by Gilles are more about the underlying 
call to get the hostname; those won't be problematic.
> 
> The regex Open MPI modules are where Open MPI is running into a 
problem with your hostnames (i.e., your hostnames don't fit into Open 
MPI's expectations of the format of the hostname).  I'm surprised that 
using the naive module (instead of the fwd module) doesn't solve your 
problem.  ...oh shoot, I see why.  It's because I had a typo in what I 
suggested to you.
> 
> Please try:  mpirun --mca regx naive ...
> 
> (i.e., "regx", not "regex")
> 
> --
> Jeff Squyres
> jsquy...@cisco.com
> 
> 
> From: Patrick Begou 
> Sent: Tuesday, June 21, 2022 12:10 PM
> To: Jeff Squyres (jsquyres); Open MPI Users
> Subject: Re: [OMPI users] OpenMPI and names of the nodes in a cluster
> 
> Hi Jeff,
> 
> Unfortunately the workaround with "--mca regex naive" does not change 
the behaviour. I'm going to investigate OpenMPI sources files as 
suggested by Gilles.
> 
> Patrick
> 
> Le 16/06/2022 à 17:43, Jeff Squyres (jsquyres) a écrit :
> 
> Ah; this is a slightly different error than what Gilles was guessing 
from your prior description.  This is what you're running in to: 
https://github.com/open-mpi/ompi/blob/v4.0.x/orte/mca/regx/fwd/regx_fwd.c#L130-L134

> 
> Try running with:
> 
> mpirun --mca regex naive ...
> 
> Specifically: the "fwd" regex component is selected by default, but it 
has certain expectations about the format of hostnames.  Try using the "
naive" regex component, instead.
> 
> --
> Jeff Squyres
> jsquy...@cisco.com
> 
> 
> From: Patrick Begou 
> Sent: Thursday, June 16, 2022 9:48 AM
> To: Jeff Squyres (jsquyres); Open MPI Users
> Subject: Re: [OMPI users] OpenMPI and names of the nodes in a cluster
> 
> Hi  Gilles and Jeff,
> 
> @Gilles I will have a look at these files, thanks.
> 
> @Jeff this is the error message (screen dump attached) and of course 
the nodes names do not agree with the standard.
> 
> Patrick
> 
> 
> Le 16/06/2022 à 14:30, Jeff Squyres (jsquyres) a écrit :
> 
> What exactly is the error that is occurring?
> 
> --
> Jeff Squyres
> jsquy...@cisco.com
> 
> 
> From: users  on behalf of Patrick Begou via users <
users@lists.open-mpi.org>
> Sent: Thursday, June 16, 2022 3:21 AM
> To: Open MPI Users
> Cc: Patrick Begou
> Subject: [OMPI users] OpenMPI and names of the nodes in a cluster
> 
> Hi all,
> 
> we are facing a serious problem with OpenMPI (4.0.2) that we have
> deployed on a cluster. We do not manage this large cluster and the 
names
> of the nodes do not agree with Internet standards for protocols: they
> contain a "_" (underscore) character.
> 
> So OpenMPI complains about this and do not run.
> 
> I've tried to use IP instead of host names in the host file without 
any
> success.
> 
> Is there a known workaround for this as requesting the administrators 
to
> change the nodes names on this large cluster may be difficult.
> 
> Thanks
> 
> Patrick
> 
> 
> 
> 
> 
> 
> 


Re: [OMPI users] OpenMPI and names of the nodes in a cluster

2022-06-16 Thread Gilles Gouaillardet via users
Patrick,

you will likely also need to apply the same hack to opal_net_get_hostname()
in opal/util/net.c


Cheers,

Gilles

On Thu, Jun 16, 2022 at 7:30 PM Gilles Gouaillardet <
gilles.gouaillar...@gmail.com> wrote:

> Patrick,
>
> I am not sure Open MPI can do that out of the box.
>
> Maybe hacking pmix_net_get_hostname() in
> opal/mca/pmix/pmix3x/pmix/src/util/net.c
>
> can do the trick.
>
>
> Cheers,
>
> Gilles
>
> On Thu, Jun 16, 2022 at 4:24 PM Patrick Begou via users <
> users@lists.open-mpi.org> wrote:
>
>> Hi all,
>>
>> we are facing a serious problem with OpenMPI (4.0.2) that we have
>> deployed on a cluster. We do not manage this large cluster and the names
>> of the nodes do not agree with Internet standards for protocols: they
>> contain a "_" (underscore) character.
>>
>> So OpenMPI complains about this and do not run.
>>
>> I've tried to use IP instead of host names in the host file without any
>> success.
>>
>> Is there a known workaround for this as requesting the administrators to
>> change the nodes names on this large cluster may be difficult.
>>
>> Thanks
>>
>> Patrick
>>
>>
>>


Re: [OMPI users] OpenMPI and names of the nodes in a cluster

2022-06-16 Thread Gilles Gouaillardet via users
Patrick,

I am not sure Open MPI can do that out of the box.

Maybe hacking pmix_net_get_hostname() in
opal/mca/pmix/pmix3x/pmix/src/util/net.c

can do the trick.


Cheers,

Gilles

On Thu, Jun 16, 2022 at 4:24 PM Patrick Begou via users <
users@lists.open-mpi.org> wrote:

> Hi all,
>
> we are facing a serious problem with OpenMPI (4.0.2) that we have
> deployed on a cluster. We do not manage this large cluster and the names
> of the nodes do not agree with Internet standards for protocols: they
> contain a "_" (underscore) character.
>
> So OpenMPI complains about this and do not run.
>
> I've tried to use IP instead of host names in the host file without any
> success.
>
> Is there a known workaround for this as requesting the administrators to
> change the nodes names on this large cluster may be difficult.
>
> Thanks
>
> Patrick
>
>
>


Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread Gilles Gouaillardet via users
Scott,

I am afraid this test is inconclusive since stdout is processed by mpirun.

What if you
mpirun -np 1 touch /tmp/xyz

abort (since it will likely hang) and
ls -l /tmp/xyz

In my experience on Mac, this kind of hang can happen if you are running a
firewall and/or the IP of your host does not match the hostname.


Cheers,

Gilles

On Thu, May 5, 2022 at 5:06 AM Scott Sayres via users <
users@lists.open-mpi.org> wrote:

> foo.sh is executable, again hangs without output.
> I command c x2 to return to shell, then
>
> ps auxwww | egrep 'mpirun|foo.sh'
> output shown below
>
> scottsayres@scotts-mbp trouble-shoot % ./foo.sh
>
> Wed May  4 12:59:15 MST 2022
>
> Wed May  4 12:59:16 MST 2022
>
> Wed May  4 12:59:17 MST 2022
>
> Wed May  4 12:59:18 MST 2022
>
> Wed May  4 12:59:19 MST 2022
>
> Wed May  4 12:59:20 MST 2022
>
> Wed May  4 12:59:21 MST 2022
>
> Wed May  4 12:59:22 MST 2022
>
> Wed May  4 12:59:23 MST 2022
>
> Wed May  4 12:59:24 MST 2022
>
> scottsayres@scotts-mbp trouble-shoot % mpirun -np 1 foo.sh
>
> ^C^C*%*
>
>   scottsayres@scotts-mbp trouble-shoot % ps auxwww | egrep
> 'mpirun|foo.sh'
>
> scottsayres  91795 100.0  0.0 409067920   1456 s002  R12:59PM
> 0:14.07 mpirun -np 1 foo.sh
>
> scottsayres  91798   0.0  0.0 408628368   1632 s002  S+1:00PM
> 0:00.00 egrep mpirun|foo.sh
>
> scottsayres@scotts-mbp trouble-shoot %
>
>
> On Wed, May 4, 2022 at 12:42 PM Jeff Squyres (jsquyres) <
> jsquy...@cisco.com> wrote:
>
>> That backtrace seems to imply that the launch may not have completed.
>>
>> Can you make an executable script foo.sh with:
>>
>> #!/bin/bash
>>
>>
>> i=0
>>
>> while test $i -lt 10; do
>>
>> date
>>
>> sleep 1
>>
>> let i=$i+1
>>
>> done
>>
>>
>> Make sure that foo.sh is executable and then run it via:
>>
>> mpirun -np 1 foo.sh
>>
>> If you start seeing output, good!If it completes, better!
>>
>> If it hangs, and/or if you don't see any output at all, do this:
>>
>> ps auxwww | egrep 'mpirun|foo.sh'
>>
>> It should show mpirun and 2 copies of foo.sh (and probably a grep).  Does
>> it?
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>>
>> 
>> From: Scott Sayres 
>> Sent: Wednesday, May 4, 2022 2:47 PM
>> To: Open MPI Users
>> Cc: Jeff Squyres (jsquyres)
>> Subject: Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3
>>
>> Following Jeff's advice, I have rebuilt open-mpi by hand using the -g
>> option.   This shows more information as below.   I am attempting George's
>> advice of how to track the child but notice that gdb does not support
>> arm64.  attempting to update lldb.
>>
>>
>> scottsayres@scotts-mbp openmpi-4.1.3 % lldb mpirun -- -np 1 hostname
>>
>> (lldb) target create "mpirun"
>>
>> Current executable set to 'mpirun' (arm64).
>>
>> (lldb) settings set -- target.run-args  "-np" "1" "hostname"
>>
>> (lldb) run
>>
>> Process 90950 launched: '/usr/local/bin/mpirun' (arm64)
>>
>> Process 90950 stopped
>>
>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
>>
>> frame #0: 0x0001bde25654 libsystem_kernel.dylib`read + 8
>>
>> libsystem_kernel.dylib`read:
>>
>> ->  0x1bde25654 <+8>:  b.lo   0x1bde25674   ; <+40>
>>
>> 0x1bde25658 <+12>: pacibsp
>>
>> 0x1bde2565c <+16>: stpx29, x30, [sp, #-0x10]!
>>
>> 0x1bde25660 <+20>: movx29, sp
>>
>> Target 0: (mpirun) stopped.
>>
>> (lldb) ^C
>>
>> (lldb) thread backtrace
>>
>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
>>
>>   * frame #0: 0x0001bde25654 libsystem_kernel.dylib`read + 8
>>
>> frame #1: 0x00010056169c libopen-pal.40.dylib`opal_fd_read(fd=27,
>> len=20, buffer=0x00016fdfe90c) at fd.c:51:14
>>
>> frame #2: 0x0001027b3388
>> mca_odls_default.so`do_parent(cd=0x63e0, read_fd=27) at
>> odls_default_module.c:495:14
>>
>> frame #3: 0x0001027b2d90
>> mca_odls_default.so`odls_default_fork_local_proc(cdptr=0x63e0)
>> at odls_default_module.c:651:12
>>
>> frame #4: 0x0001003246f8
>> libopen-rte.40.dylib`orte_odls_base_spawn_proc(fd=-1, sd=4,
>> cbdata=0x63e0) at odls_base_default_fns.c:1046:31
>>
>> frame #5: 0x00010057a7a0
>> libopen-pal.40.dylib`opal_libevent2022_event_base_loop [inlined]
>> event_process_active_single_queue(base=0x0001007061c0) at
>> event.c:1370:4 [opt]
>>
>> frame #6: 0x00010057a628
>> libopen-pal.40.dylib`opal_libevent2022_event_base_loop [inlined]
>> event_process_active(base=0x0001007061c0) at event.c:1440:8 [opt]
>>
>> frame #7: 0x00010057a5ec
>> libopen-pal.40.dylib`opal_libevent2022_event_base_loop(base=0x0001007061c0,
>> flags=) at event.c:1644:12 [opt]
>>
>> frame #8: 0x00013b04 mpirun`orterun(argc=4,
>> argv=0x00016fdff268) at orterun.c:179:9
>>
>> frame #9: 0x00013904 mpirun`main(argc=4,
>> argv=0x00016fdff268) at main.c:13:12
>>
>> frame #10: 0x000100015088 dyld`start + 516
>>

Re: [OMPI users] mpi-test-suite shows errors on openmpi 4.1.x

2022-05-03 Thread Gilles Gouaillardet via users
Alois,

Thanks for the report.

FWIW, I am not seeing any errors on my Mac with Open MPI from brew (4.1.3)

How many MPI tasks are you running?
Can you please confirm you can evidence the error with

mpirun -np <number of tasks> ./mpi_test_suite -d MPI_TYPE_MIX_ARRAY -c
0 -t collective


Also, can you try the same command with
mpirun --mca pml ob1 --mca btl tcp,self ...

Cheers,

Gilles

On Tue, May 3, 2022 at 7:08 PM Alois Schlögl via users <
users@lists.open-mpi.org> wrote:

>
> Within our cluster (debian10/slurm16, debian11/slurm20), with
> infiniband, and we have several instances of openmpi installed through
> the Lmod module system. When testing the openmpi installations with the
> mpi-test-suite 1.1 [1], it shows errors like these
>
> ...
> Rank:0) tst_test_array[45]:Allreduce Min/Max with MPI_IN_PLACE
> (Rank:0) tst_test_array[46]:Allreduce Sum
> (Rank:0) tst_test_array[47]:Alltoall
> Number of failed tests: 130
> Summary of failed tests:
> ERROR class:P2P test:Ring Send Pack (7), comm Duplicated MPI_COMM_WORLD
> (4), type MPI_TYPE_MIX (27) number of values:1000
> ERROR class:P2P test:Ring Send Pack (7), comm Duplicated MPI_COMM_WORLD
> (4), type MPI_TYPE_MIX_ARRAY (28) number of values:1000
> ...
>
> when using openmpi/4.1.x (i tested with 4.1.1 and 4.1.3)  The number of
> errors may vary, but the first errors are always about
> ERROR class:P2P test:Ring Send Pack (7), comm Duplicated MPI_COMM_WORLD
>
> When testing on openmpi/3.1.3, the tests runs successfully, and there
> are no failed tests.
>
> Typically, the openmpi/4.1.x installation is configured with
>  ./configure --prefix=${PREFIX} \
>  --with-ucx=$UCX_HOME \
>  --enable-orterun-prefix-by-default  \
>  --enable-mpi-cxx \
>  --with-hwloc \
>  --with-pmi \
>  --with-pmix \
>  --with-cuda=$CUDA_HOME \
>  --with-slurm
>
> but I've also tried different compilation options including w/ and w/o
> --enable-mpi1-compatibility, w/ and w/o ucx, using hwloc from the OS, or
> compiled from source. But I could not identify any pattern.
>
> Therefore, I'd like asking you what the issue might be. Specifically,
> I'm would like to know:
>
> - Am I right in assuming that mpi-test-suite [1] suitable for testing
> openmpi ?
> - what are possible causes for these type of errors ?
> - what would you recommend how to debug these issues ?
>
> Kind regards,
>Alois
>
>
> [1] https://github.com/open-mpi/mpi-test-suite/t
>
>


Re: [OMPI users] Help diagnosing MPI+OpenMP application segmentation fault only when run with --bind-to none

2022-04-22 Thread Gilles Gouaillardet via users
You can first double check your code calls
MPI_Init_thread(..., MPI_THREAD_MULTIPLE, ...)
and that the provided level is MPI_THREAD_MULTIPLE as you requested.

Cheers,

Gilles
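
For reference, a minimal C sketch of that check (the application in this
thread is Fortran, but the logic is identical):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;

    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "MPI_THREAD_MULTIPLE not granted (got level %d)\n", provided);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    /* calling MPI from several OpenMP threads is only safe past this point */

    MPI_Finalize();
    return 0;
}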

On Fri, Apr 22, 2022, 21:45 Angel de Vicente via users <
users@lists.open-mpi.org> wrote:

> Hello,
>
> I'm running out of ideas, and wonder if someone here could have some
> tips on how to debug a segmentation fault I'm having with my
> application [due to the nature of the problem I'm wondering if the
> problem is with OpenMPI itself rather than my app, though at this point
> I'm not leaning strongly either way].
>
> The code is hybrid MPI+OpenMP and I compile it with gcc 10.3.0 and
> OpenMPI 4.1.3.
>
> Usually I was running the code with "mpirun -np X --bind-to none [...]"
> so that the threads created by OpenMP don't get bound to a single core
> and I actually get proper speedup out of OpenMP.
>
> Now, since I introduced some changes to the code this week (though I
> have read the changes carefully a number of times, and I don't see
> anything suspicious), I now get a segmentation fault sometimes, but only
> when I run with "--bind-to none" and only in my workstation. It is not
> always with the same running configuration, but I can see some pattern,
> and the problem shows up only if I run the version compiled with OpenMP
> support and most of the times only when the number of rank*threads goes
> above 4 or so. If I run it with "--bind-to socket" all looks good all
> the time.
>
> If I run it in another server, "--bind-to none" doesn't seem to be any
> issue (I submitted the jobs many many times and not a single
> segmentation fault), but in my workstation it fails almost every time if
> using MPI+OpenMP with a handful of threads and with "--bind-to none". In
> this other server I'm running gcc 9.3.0 and OpenMPI 4.1.3.
>
> For example, setting OMP_NUM_THREADS to 1, I run the code like the
> following, and get the segmentation fault as below:
>
> ,
> | angelv@sieladon:~/.../Fe13_NL3/t~gauss+isat+istim$ mpirun -np 4
> --bind-to none  ../../../../../pcorona+openmp~gauss Fe13_NL3.params
> |  Reading control file: Fe13_NL3.params
> |   ... Control file parameters broadcasted
> |
> | [...]
> |
> |  Starting calculation loop on the line of sight
> |  Receiving results from:2
> |  Receiving results from:1
> |
> | Program received signal SIGSEGV: Segmentation fault - invalid memory
> reference.
> |
> | Backtrace for this error:
> |  Receiving results from:3
> | #0  0x7fd747e7555f in ???
> | #1  0x7fd7488778e1 in ???
> | #2  0x7fd7488667a4 in ???
> | #3  0x7fd7486fe84c in ???
> | #4  0x7fd7489aa9ce in ???
> | #5  0x414959 in __pcorona_main_MOD_main_loop._omp_fn.0
> | at src/pcorona_main.f90:627
> | #6  0x7fd74813ec75 in ???
> | #7  0x412bb0 in pcorona
> | at src/pcorona.f90:49
> | #8  0x40361c in main
> | at src/pcorona.f90:17
> |
> | [...]
> |
> |
> --
> | mpirun noticed that process rank 3 with PID 0 on node sieladon exited on
> signal 11 (Segmentation fault).
> | ---
> `
>
> I cannot see inside the MPI library (I don't really know if that would
> be helpful) but line 627 in pcorona_main.f90 is:
>
> ,
> |  call
> mpi_probe(master,mpi_any_tag,mpi_comm_world,stat,mpierror)
> `
>
> Any ideas/suggestions what could be going on or how to try an get some
> more clues about the possible causes of this?
>
> Many thanks,
> --
> Ángel de Vicente
>
> Tel.: +34 922 605 747
> Web.: http://research.iac.es/proyecto/polmag/
>
> -
> AVISO LEGAL: Este mensaje puede contener información confidencial y/o
> privilegiada. Si usted no es el destinatario final del mismo o lo ha
> recibido por error, por favor notifíquelo al remitente inmediatamente.
> Cualquier uso no autorizadas del contenido de este mensaje está
> estrictamente prohibida. Más información en:
> https://www.iac.es/es/responsabilidad-legal
> DISCLAIMER: This message may contain confidential and / or privileged
> information. If you are not the final recipient or have received it in
> error, please notify the sender immediately. Any unauthorized use of the
> content of this message is strictly prohibited. More information:
> https://www.iac.es/en/disclaimer
>


Re: [OMPI users] help with M1 chip macOS openMPI installation

2022-04-21 Thread Gilles Gouaillardet via users
Cici,

I do not think the Intel C compiler is able to generate native code for the
M1 (aarch64).
In the best case scenario, it would generate code for x86_64 and Rosetta
would then be used to translate it to aarch64 code, which is a severely
degraded solution.

So if you really want to stick to the Intel compiler, I strongly encourage
you to run on Intel/AMD processors.
Otherwise, use a native compiler for aarch64, and in this case, brew is not
a bad option.


Cheers,

Gilles

On Thu, Apr 21, 2022 at 6:36 PM Cici Feng via users <
users@lists.open-mpi.org> wrote:

> Hi there,
>
> I am trying to install an electromagnetic inversion software (MARE2DEM) of
> which the intel C compilers and open-MPI are considered as the
> prerequisite. However, since I am completely new to computer science and
> coding, together with some of the technical issues of the computer I am
> building all this on, I have encountered some questions with the whole
> process.
>
> The computer I am working on is a macbook pro with a M1 Max chip. Despite
> how my friends have discouraged me to keep working on my M1 laptop, I still
> want to reach out to the developers since I feel like you guys might have a
> solution.
>
> By downloading the source code of openMPI on the .org website and "sudo
> configure and make all install", I was not able to install the openMPI onto
> my computer. The error provided mentioned something about the chip is not
> supported or somewhat.
>
> I have also tried to install openMPI through homebrew using the command
> "brew install openmpi" and it worked just fine. However, since Homebrew has
> automatically set up the configuration of openMPI (it uses gcc and
> gfortran), I was not able to use my intel compilers to build openMPI which
> causes further problems in the installation of my inversion software.
>
> In conclusion, I think right now the M1 chip is the biggest problem of the
> whole installation process yet I think you guys might have some solution
> for the installation. I would assume that Apple is switching all of its
> chip to M1 which makes the shifts and changes inevitable.
>
> I would really like to hear from you with the solution of installing
> openMPI on a M1-chip macbook and I would like to thank for your time to
> read my prolong email.
>
> Thank you very much.
> Sincerely,
>
> Cici
>
>
>
>
>
>


Re: [OMPI users] Is there a MPI routine that returns the value of "npernode" being used?

2022-04-02 Thread Gilles Gouaillardet via users
Ernesto,

Not directly.

But you can use MPI_Comm_split_type(..., MPI_COMM_TYPE_SHARED, ...) and then
MPI_Comm_size(...) on the "returned" communicator.

Cheers,

Gilles
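
A minimal C sketch of that idea (illustrative only):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Comm node_comm;
    int procs_on_this_node;

    MPI_Init(&argc, &argv);

    /* split MPI_COMM_WORLD into one communicator per shared-memory node */
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &node_comm);

    /* its size is the number of ranks on this node, i.e. the effective "npernode" */
    MPI_Comm_size(node_comm, &procs_on_this_node);
    printf("%d ranks on this node\n", procs_on_this_node);

    MPI_Comm_free(&node_comm);
    MPI_Finalize();
    return 0;
}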

On Sun, Apr 3, 2022 at 5:52 AM Ernesto Prudencio via users <
users@lists.open-mpi.org> wrote:

> Thanks,
>
>
>
> Ernesto.
>
> Schlumberger-Private
>


Re: [OMPI users] 101 question on MPI_Bcast()

2022-04-02 Thread Gilles Gouaillardet via users
Ernesto,

MPI_Bcast() has no barrier semantic.
It means the root rank can return after the message is sent (kind of eager
send) and before it is received by other ranks.


Cheers,

Gilles
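
A small C sketch of the call pattern being discussed (note that MPI_Bcast must
be called by every rank of the communicator, not only the root):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, value = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) value = 42;

    /* every rank calls MPI_Bcast; the root may return as soon as the message
     * has been handed off (eager send), before the other ranks receive it */
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);

    printf("Here A, rank = %d\n", rank);   /* rank 0 can reach this "early" */

    MPI_Barrier(MPI_COMM_WORLD);           /* this is the actual synchronization */

    printf("Here B, rank = %d\n", rank);

    MPI_Finalize();
    return 0;
}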

On Sat, Apr 2, 2022, 09:33 Ernesto Prudencio via users <
users@lists.open-mpi.org> wrote:

> I have an “extreme” case below, for the sake of example.
>
>
>
> Suppose one is running a MPI job with N >= 2 ranks, and at a certain
> moment the code does the following:
>
>
>
> .
>
> .
>
> .
>
> If (rank == 0) {
>
> MPI_Bcast(…);
>
> }
>
> .
>
> .
>
> .
>
> std::cout << “Here A, rank = “ << rank << std::endl;
>
> MPI_Barrier(…);
>
> std::cout << “Here B, rank = “ << rank << std::endl;
>
> .
>
> .
>
> .
>
>
>
> I thought rank 0 would never print the message “Here A”, because he MPI
> lib at rank 0 would be stuck on the MPI_Bcast waiting for all other ranks
> to notify (internally, in the MPI lib logic) that they have received the
> contents.
>
>
>
> But this seems not to be the case. Instead, the code behaves as follows:
>
>1. MPI_Bcast() returns the processing to rank 0, so it (rank 0) prints
>the “Here A” message (and all the other ranks print “Here A” as well).
>2. All ranks get to the barrier, and then all of them print the “Here
>B” message afterwards.
>
>
>
> Am I correct on the statements (1) and (2) above?
>
>
>
> Thanks,
>
>
>
> Ernesto.
>
> Schlumberger-Private
>


Re: [OMPI users] Need help for troubleshooting OpenMPI performances

2022-03-24 Thread Gilles Gouaillardet via users
Patrick,

In the worst case scenario, requiring MPI_THREAD_MULTIPLE support can
disable some fast interconnects
and make your app fall back on IPoIB or similar. And in that case, Open MPI
might prefer a suboptimal
IP network, which can impact the overall performance even more.

Which threading support level does your app ask for?
Many applications do not call MPI in the OpenMP regions at all, or only the
master thread invokes MPI,
and in this case, MPI_THREAD_FUNNELED is enough.

Are you using UCX or the legacy openib btl?
If the former, is it built with multi-threading support?
If the latter, I suggest you give UCX - built with multi-threading support
- a try and see how it goes.


Cheers,

Gilles
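
For what it is worth, one way to check whether a UCX installation was built
with multi-threading support (assuming its ucx_info is in the PATH) is

ucx_info -v

which should report the options UCX was configured with; --enable-mt among
them indicates multi-threading support.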

On Thu, Mar 24, 2022 at 5:43 PM Patrick Begou via users <
users@lists.open-mpi.org> wrote:

> Le 28/02/2022 à 17:56, Patrick Begou via users a écrit :
> > Hi,
> >
> > I meet a performance problem with OpenMPI on my cluster. In some
> > situation my parallel code is really slow (same binary running on a
> > different mesh).
> >
> > To investigate, the fortran code code is built with profiling option
> > (mpifort -p -O3.) and launched on 91 cores.
> >
> > One mon.out file per process, they show a maximum cpu time of 20.4
> > seconds for each processes (32.7 seconds on my old cluster) and this
> > is Ok.
> >
> > But running on my new cluster requires near 3mn instead of 1mn on the
> > old cluster (elapsed time).
> >
> > New cluster is running OpenMPI 4.05 with HDR-100 connections.
> >
> > Old cluster is running OpenMPI 3.1 with QDR connections.
> >
> > Running Osu Collectives tests on 91 cores shows good latency values on
> > 91 cores and the point-to-points between nodes is correct.
> >
> > How can I investigate this problem as it seams related to MPI
> > communications in some situations that I can reproduce? Using Scalasca
> > ? Other tools ? OpenMPI is not built with special profiling options.
> >
> > Thanks
> >
> > Patrick
> >
> >
> Just to provide an answer to this old thread, the problem has been found
> (but not solved). The application was rebuilt with OpenMP flag (hybrid
> parallelism is implemented with MPI and OpenMP). Setting this flag, even
> if we only use one thread and MPI only parallelism, change OpenMPI
> initialisation from MPI_INIT to MPI_INIT_THREAD in our code and this
> create the big slowdown of the application.
>
> We have temporally removed the OpenMP flag to build the application.
>
> Patrick
>
>
>


Re: [OMPI users] [Ext] Re: Call to MPI_Allreduce() returning value 15

2022-03-14 Thread Gilles Gouaillardet via users
in order to exclude the coll/tuned component:

mpirun --mca coll ^tuned ...


Cheers,

Gilles

On Mon, Mar 14, 2022 at 5:37 PM Ernesto Prudencio via users <
users@lists.open-mpi.org> wrote:

> Thanks for the hint on “mpirun ldd”. I will try it. The problem is that I
> am running on the cloud and it is trickier to get into a node at run time,
> or save information to be retrieved later.
>
>
>
> Sorry for my ignorance on mca stuff, but what would exactly be the
> suggested mpirun command line options on coll / tuned?
>
>
>
> Cheers,
>
>
>
> Ernesto.
>
>
>
> *From:* users  *On Behalf Of *Gilles
> Gouaillardet via users
> *Sent:* Monday, March 14, 2022 2:22 AM
> *To:* Open MPI Users 
> *Cc:* Gilles Gouaillardet 
> *Subject:* Re: [OMPI users] [Ext] Re: Call to MPI_Allreduce() returning
> value 15
>
>
>
> Ernesto,
>
>
>
> you can
>
> mpirun ldd 
>
>
>
> and double check it uses the library you expect.
>
>
>
>
>
> you might want to try adapting your trick to use Open MPI 4.1.2 with your
> binary built with Open MPI 4.0.3 and see how it goes.
>
> i'd try disabling coll/tuned first though.
>
>
>
>
>
> Keep in mind PETSc might call MPI_Allreduce under the hood with matching
> but different signatures.
>
>
>
>
>
> Cheers,
>
>
>
> Gilles
>
>
>
> On Mon, Mar 14, 2022 at 4:09 PM Ernesto Prudencio via users <
> users@lists.open-mpi.org> wrote:
>
> Thanks, Gilles.
>
>
>
> In the case of the application I am working on, all ranks call MPI with
> the same signature / types of variables.
>
>
>
> I do not think there is a code error anywhere. I think this is “just” a
> configuration error from my part.
>
>
>
> Regarding the idea of changing just one item at a time: that would be the
> next step, but first I would like to check if my suspicion that the
> presence of both “/opt/openmpi_4.0.3” and
> “/appl-third-parties/openmpi-4.1.2” at run time could be an issue:
>
>- It is an issue on situation 2, when I explicitly point the runtime
>mpi to be 4.1.2 (also used in compilation)
>    - It is not an issue on situation 3, when I explicitly point the
>runtime mpi to be 4.0.3 compiled with INTEL (even though I compiled the
>application and openmpi 4.1.2 with GNU, and I link the application with
>openmpi 4.1.2)
>
>
>
> Best,
>
>
>
> Ernesto.
>
>
>
> *From:* Gilles Gouaillardet 
> *Sent:* Monday, March 14, 2022 1:37 AM
> *To:* Open MPI Users 
> *Cc:* Ernesto Prudencio 
> *Subject:* Re: [OMPI users] [Ext] Re: Call to MPI_Allreduce() returning
> value 15
>
>
>
> Ernesto,
>
>
>
> the coll/tuned module (that should handle collective subroutines by
> default) has a known issue when matching but non identical signatures are
> used:
>
> for example, one rank uses one vector of n bytes, and an other rank uses n
> bytes.
>
> Is there a chance your application might use this pattern?
>
>
>
> You can give try disabling this component with
>
> mpirun --mca coll ^tuned ...
>
>
>
>
>
> I noted between the successful a) case and the unsuccessful b) case, you
> changed 3 parameters:
>
>  - compiler vendor
>
>  - Open MPI version
>
>  - PETSc 3.10.4
>
> so at this stage, it is not obvious which should be blamed for the failure.
>
>
>
>
>
> In order to get a better picture, I would first try
>
>  - Intel compilers
>
>  - Open MPI 4.1.2
>
>  - PETSc 3.10.4
>
>
>
> => a failure would suggest a regression in Open MPI
>
>
>
> And then
>
>  - Intel compilers
>
>  - Open MPI 4.0.3
>
>  - PETSc 3.16.5
>
>
>
> => a failure would either suggest a regression in PETSc, or PETSc doing
> something different but legit that evidences a bug in Open MPI.
>
>
>
> If you have time, you can also try
>
>  - Intel compilers
>
>  - MPICH (or a derivative such as Intel MPI)
>
>  - PETSc 3.16.5
>
>
>
> => a success would strongly point to Open MPI
>
>
>
>
>
> Cheers,
>
>
>
> Gilles
>
>
>
> On Mon, Mar 14, 2022 at 2:56 PM Ernesto Prudencio via users <
> users@lists.open-mpi.org> wrote:
>
> Forgot to mention that in all 3 situations, mpirun is called as follows
> (35 nodes, 4 MPI ranks per node):
>
>
>
> mpirun -x LD_LIBRARY_PATH=:::… -hostfile /tmp/hostfile.txt
> -np 140 -npernode 4 --mca btl_tcp_if_include eth0 
> 
>
>
>
> So I have a question 3) Should I add some extra option in the mpirun
> command line in order to make situation 2 successful?
>
>
>

Re: [OMPI users] [Ext] Re: Call to MPI_Allreduce() returning value 15

2022-03-14 Thread Gilles Gouaillardet via users
Ernesto,

you can
mpirun ldd <your executable>

and double check it uses the library you expect.


you might want to try adapting your trick to use Open MPI 4.1.2 with your
binary built with Open MPI 4.0.3 and see how it goes.
I'd try disabling coll/tuned first though.


Keep in mind PETSc might call MPI_Allreduce under the hood with matching
but different signatures.


Cheers,

Gilles

On Mon, Mar 14, 2022 at 4:09 PM Ernesto Prudencio via users <
users@lists.open-mpi.org> wrote:

> Thanks, Gilles.
>
>
>
> In the case of the application I am working on, all ranks call MPI with
> the same signature / types of variables.
>
>
>
> I do not think there is a code error anywhere. I think this is “just” a
> configuration error from my part.
>
>
>
> Regarding the idea of changing just one item at a time: that would be the
> next step, but first I would like to check if my suspicion that the
> presence of both “/opt/openmpi_4.0.3” and
> “/appl-third-parties/openmpi-4.1.2” at run time could be an issue:
>
>- It is an issue on situation 2, when I explicitly point the runtime
>mpi to be 4.1.2 (also used in compilation)
>- It is not an issue on situation 3, when I explicitly point the
>runtime mpi to be 4.0.3 compiled with INTEL (even though I compiled the
>application and openmpi 4.1.2 with GNU, and I link the application with
>openmpi 4.1.2)
>
>
>
> Best,
>
>
>
> Ernesto.
>
>
>
> *From:* Gilles Gouaillardet 
> *Sent:* Monday, March 14, 2022 1:37 AM
> *To:* Open MPI Users 
> *Cc:* Ernesto Prudencio 
> *Subject:* Re: [OMPI users] [Ext] Re: Call to MPI_Allreduce() returning
> value 15
>
>
>
> Ernesto,
>
>
>
> the coll/tuned module (that should handle collective subroutines by
> default) has a known issue when matching but non identical signatures are
> used:
>
> for example, one rank uses one vector of n bytes, and an other rank uses n
> bytes.
>
> Is there a chance your application might use this pattern?
>
>
>
> You can give try disabling this component with
>
> mpirun --mca coll ^tuned ...
>
>
>
>
>
> I noted between the successful a) case and the unsuccessful b) case, you
> changed 3 parameters:
>
>  - compiler vendor
>
>  - Open MPI version
>
>  - PETSc 3.10.4
>
> so at this stage, it is not obvious which should be blamed for the failure.
>
>
>
>
>
> In order to get a better picture, I would first try
>
>  - Intel compilers
>
>  - Open MPI 4.1.2
>
>  - PETSc 3.10.4
>
>
>
> => a failure would suggest a regression in Open MPI
>
>
>
> And then
>
>  - Intel compilers
>
>  - Open MPI 4.0.3
>
>  - PETSc 3.16.5
>
>
>
> => a failure would either suggest a regression in PETSc, or PETSc doing
> something different but legit that evidences a bug in Open MPI.
>
>
>
> If you have time, you can also try
>
>  - Intel compilers
>
>  - MPICH (or a derivative such as Intel MPI)
>
>  - PETSc 3.16.5
>
>
>
> => a success would strongly point to Open MPI
>
>
>
>
>
> Cheers,
>
>
>
> Gilles
>
>
>
> On Mon, Mar 14, 2022 at 2:56 PM Ernesto Prudencio via users <
> users@lists.open-mpi.org> wrote:
>
> Forgot to mention that in all 3 situations, mpirun is called as follows
> (35 nodes, 4 MPI ranks per node):
>
>
>
> mpirun -x LD_LIBRARY_PATH=:::… -hostfile /tmp/hostfile.txt
> -np 140 -npernode 4 --mca btl_tcp_if_include eth0 
> 
>
>
>
> So I have a question 3) Should I add some extra option in the mpirun
> command line in order to make situation 2 successful?
>
>
>
> Thanks,
>
>
>
> Ernesto.
>
>
>
>
>
> Schlumberger-Private
>
>
>
> Schlumberger-Private
>
> *From:* users  *On Behalf Of *Ernesto
> Prudencio via users
> *Sent:* Monday, March 14, 2022 12:39 AM
> *To:* Open MPI Users 
> *Cc:* Ernesto Prudencio 
> *Subject:* Re: [OMPI users] [Ext] Re: Call to MPI_Allreduce() returning
> value 15
>
>
>
> Thank you for the quick answer, George. I wanted to investigate the
> problem further before replying.
>
>
>
> Below I show 3 situations of my C++ (and Fortran) application, which runs
> on top of PETSc, OpenMPI, and MKL. All 3 situations use MKL 2019.0.5
> compiled with INTEL.
>
>
>
> At the end, I have 2 questions.
>
>
>
> Note: all codes are compiled in a certain set of nodes, and the execution
> happens at _*another*_ set of nodes.
>
>
>
> +- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> - - - - - - - -

Re: [OMPI users] [Ext] Re: Call to MPI_Allreduce() returning value 15

2022-03-14 Thread Gilles Gouaillardet via users
Ernesto,

the coll/tuned module (that should handle collective subroutines by
default) has a known issue when matching but non identical signatures are
used:
for example, one rank uses one vector of n bytes, and another rank uses n
bytes.
Is there a chance your application might use this pattern?
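
To make the pattern concrete, a hypothetical C sketch (using MPI_Bcast for
simplicity, not code from the application discussed here):

#include <mpi.h>

int main(int argc, char **argv)
{
    enum { n = 1000 };
    char buf[n] = { 0 };
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* the root describes the buffer as 1 element of a derived type covering n bytes */
        MPI_Datatype blob;
        MPI_Type_contiguous(n, MPI_BYTE, &blob);
        MPI_Type_commit(&blob);
        MPI_Bcast(buf, 1, blob, 0, MPI_COMM_WORLD);
        MPI_Type_free(&blob);
    } else {
        /* the other ranks describe the same data as n separate bytes: the type
         * signatures match, but they are not identical */
        MPI_Bcast(buf, n, MPI_BYTE, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}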

You can try disabling this component with
mpirun --mca coll ^tuned ...


I noted between the successful a) case and the unsuccessful b) case, you
changed 3 parameters:
 - compiler vendor
 - Open MPI version
 - PETSc 3.10.4
so at this stage, it is not obvious which should be blamed for the failure.


In order to get a better picture, I would first try
 - Intel compilers
 - Open MPI 4.1.2
 - PETSc 3.10.4

=> a failure would suggest a regression in Open MPI

And then
 - Intel compilers
 - Open MPI 4.0.3
 - PETSc 3.16.5

=> a failure would either suggest a regression in PETSc, or PETSc doing
something different but legit that evidences a bug in Open MPI.

If you have time, you can also try
 - Intel compilers
 - MPICH (or a derivative such as Intel MPI)
 - PETSc 3.16.5

=> a success would strongly point to Open MPI


Cheers,

Gilles

On Mon, Mar 14, 2022 at 2:56 PM Ernesto Prudencio via users <
users@lists.open-mpi.org> wrote:

> Forgot to mention that in all 3 situations, mpirun is called as follows
> (35 nodes, 4 MPI ranks per node):
>
>
>
> mpirun -x LD_LIBRARY_PATH=:::… -hostfile /tmp/hostfile.txt
> -np 140 -npernode 4 --mca btl_tcp_if_include eth0 
> 
>
>
>
> So I have a question 3) Should I add some extra option in the mpirun
> command line in order to make situation 2 successful?
>
>
>
> Thanks,
>
>
>
> Ernesto.
>
>
>
>
>
> Schlumberger-Private
>
> *From:* users  *On Behalf Of *Ernesto
> Prudencio via users
> *Sent:* Monday, March 14, 2022 12:39 AM
> *To:* Open MPI Users 
> *Cc:* Ernesto Prudencio 
> *Subject:* Re: [OMPI users] [Ext] Re: Call to MPI_Allreduce() returning
> value 15
>
>
>
> Thank you for the quick answer, George. I wanted to investigate the
> problem further before replying.
>
>
>
> Below I show 3 situations of my C++ (and Fortran) application, which runs
> on top of PETSc, OpenMPI, and MKL. All 3 situations use MKL 2019.0.5
> compiled with INTEL.
>
>
>
> At the end, I have 2 questions.
>
>
>
> Note: all codes are compiled in a certain set of nodes, and the execution
> happens at _*another*_ set of nodes.
>
>
>
> +- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> - - - - - - - - -
>
>
>
> Situation 1) It has been successful for months now:
>
>
>
> a) Use INTEL compilers for OpenMPI 4.0.3, PETSc 3.10.4 , and application.
> The configuration options for OpenMPI are:
>
>
>
> '--with-flux-pmi=no' '--enable-orterun-prefix-by-default'
> '--prefix=/mnt/disks/intel-2018-3-222-blade-runtime-env-2018-1-07-08-2018-132838/openmpi_4.0.3_intel2019.5_gcc7.3.1'
> 'FC=ifort' 'CC=gcc'
>
>
>
> b) At run time, each MPI rank prints this info:
>
>
>
> PATH =
> /opt/openmpi_4.0.3/bin:/opt/openmpi_4.0.3/bin:/opt/openmpi_4.0.3/bin:/opt/rh/devtoolset-7/root/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
>
>
>
> LD_LIBRARY_PATH  =
> /opt/openmpi_4.0.3/lib::/opt/rh/devtoolset-7/root/usr/lib64:/opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7:/opt/petsc/lib:/opt/2019.5/compilers_and_libraries/linux/mkl/lib/intel64:/opt/openmpi_4.0.3/lib:/lib64:/lib:/usr/lib64:/usr/lib
>
>
>
> MPI version (compile time)   = 4.0.3
>
> MPI_Get_library_version()= Open MPI v4.0.3, package: Open MPI 
> root@
> Distribution, ident: 4.0.3, repo rev: v4.0.3, Mar 03, 2020
>
> PETSc version (compile time) = 3.10.4
>
>
>
> c) A test of 20 minutes with 14 nodes, 4 MPI ranks per node, runs ok.
>
>
>
> d) A test of 2 hours with 35 nodes, 4 MPI ranks per node, runs ok.
>
>
>
> +- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> - - - - - - - - -
>
>
>
> Situation 2) This situation is the one failing during execution.
>
>
>
> a) Use GNU compilers for OpenMPI 4.1.2, PETSc 3.16.5 , and application.
> The configuration options for OpenMPI are:
>
>
>
> '--with-flux-pmi=no' '--prefix=/appl-third-parties/openmpi-4.1.2'
> '--enable-orterun-prefix-by-default'
>
>
>
> b) At run time, each MPI rank prints this info:
>
>
>
> PATH  = /appl-third-parties/openmpi-4.1.2/bin
> :/appl-third-parties/openmpi-4.1.2/bin:/appl-third-parties/openmpi-4.1.2/bin:/opt/rh/devtoolset-7/root/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
>
>
>
> LD_LIBRARY_PATH = /appl-third-parties/openmpi-4.1.2/lib
> ::/opt/rh/devtoolset-7/root/usr/lib64:/opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7:/appl-third-parties/petsc-3.16.5/lib
>
>
> :/opt/2019.5/compilers_and_libraries/linux/mkl/lib/intel64:/appl-third-parties/openmpi-4.1.2/lib:/lib64:/lib:/usr/lib64:/usr/lib
>
>
>
> MPI version (compile time)= 4.1.2
>
> 

Re: [OMPI users] Trouble compiling OpenMPI with Infiniband support

2022-02-17 Thread Gilles Gouaillardet via users
Angel,

Infiniband detection likely fails before checking expanded verbs.
Please compress and post the full configure output


Cheers,

Gilles

On Fri, Feb 18, 2022 at 12:02 AM Angel de Vicente via users <
users@lists.open-mpi.org> wrote:

> Hi,
>
> I'm trying to compile the latest OpenMPI version with Infiniband support
> in our local cluster, but didn't get very far (since I'm installing this
> via Spack, I also asked in their support group).
>
> I'm doing the installation via Spack, which is issuing the following
> .configure step (see the options given for --with-knem, --with-hcoll and
> --with-mxm):
>
> ,
> | configure'
> |
> '--prefix=/storage/projects/can30/angelv/spack/opt/spack/linux-sles12-sandybridge/gcc-9.3.0/openmpi-4.1.1-jsvbusyjgthr2d6oyny5klt62gm6ma2u'
> | '--enable-shared' '--disable-silent-rules' '--disable-builtin-atomics'
> | '--enable-static' '--without-pmi'
> |
> '--with-zlib=/storage/projects/can30/angelv/spack/opt/spack/linux-sles12-sandybridge/gcc-9.3.0/zlib-1.2.11-hrstx5ffrg4f4k3xc2anyxed3mmgdcoz'
> | '--enable-mpi1-compatibility' '--with-knem=/opt/knem-1.1.2.90mlnx2'
> | '--with-hcoll=/opt/mellanox/hcoll' '--without-psm' '--without-ofi'
> | '--without-cma' '--without-ucx' '--without-fca'
> | '--with-mxm=/opt/mellanox/mxm' '--without-verbs' '--without-xpmem'
> | '--without-psm2' '--without-alps' '--without-lsf' '--without-sge'
> | '--without-slurm' '--without-tm' '--without-loadleveler'
> | '--disable-memchecker'
> |
> '--with-libevent=/storage/projects/can30/angelv/spack/opt/spack/linux-sles12-sandybridge/gcc-9.3.0/libevent-2.1.12-yd5l4tjmnigv6dqlv5afpn4zc6ekdchc'
> |
> '--with-hwloc=/storage/projects/can30/angelv/spack/opt/spack/linux-sles12-sandybridge/gcc-9.3.0/hwloc-2.6.0-bfnt4g3givflydpe5d2iglyupgbzxbfn'
> | '--disable-java' '--disable-mpi-java' '--without-cuda'
> | '--enable-wrapper-rpath' '--disable-wrapper-runpath' '--disable-mpi-cxx'
> | '--disable-cxx-exceptions'
> |
> '--with-wrapper-ldflags=-Wl,-rpath,/storage/projects/can30/angelv/spack/opt/spack/linux-sles12-sandybridge/gcc-7.2.0/gcc-9.3.0-ghr2jekwusoa4zip36xsa3okgp3bylqm/lib/gcc/x86_64-pc-linux-gnu/9.3.0
> |
> -Wl,-rpath,/storage/projects/can30/angelv/spack/opt/spack/linux-sles12-sandybridge/gcc-7.2.0/gcc-9.3.0-ghr2jekwusoa4zip36xsa3okgp3bylqm/lib64'
> `
>
> Later on in the configuration phase I see:
>
> ,
> | --- MCA component btl:openib (m4 configuration macro)
> | checking for MCA component btl:openib compile mode... static
> | checking whether expanded verbs are available... yes
> | checking whether IBV_EXP_ATOMIC_HCA_REPLY_BE is declared... yes
> | checking whether IBV_EXP_QP_CREATE_ATOMIC_BE_REPLY is declared... yes
> | checking whether ibv_exp_create_qp is declared... yes
> | checking whether ibv_exp_query_device is declared... yes
> | checking whether IBV_EXP_QP_INIT_ATTR_ATOMICS_ARG is declared... yes
> | checking for struct ibv_exp_device_attr.ext_atom... yes
> | checking for struct ibv_exp_device_attr.exp_atomic_cap... yes
> | checking if MCA component btl:openib can compile... no
> `
>
> This is the first time I try to compile OpenMPI this way, and I get a
> bit confused with what each bit is doing, but it looks like it goes
> through the moves to get the btl:openib built, but then for some reason
> it cannot compile it.
>
> Any suggestions/pointers?
>
> Many thanks,
> --
> Ángel de Vicente
>
> Tel.: +34 922 605 747
> Web.: http://research.iac.es/proyecto/polmag/
>
> -
> AVISO LEGAL: Este mensaje puede contener información confidencial y/o
> privilegiada. Si usted no es el destinatario final del mismo o lo ha
> recibido por error, por favor notifíquelo al remitente inmediatamente.
> Cualquier uso no autorizadas del contenido de este mensaje está
> estrictamente prohibida. Más información en:
> https://www.iac.es/es/responsabilidad-legal
> DISCLAIMER: This message may contain confidential and / or privileged
> information. If you are not the final recipient or have received it in
> error, please notify the sender immediately. Any unauthorized use of the
> content of this message is strictly prohibited. More information:
> https://www.iac.es/en/disclaimer
>


Re: [OMPI users] libmpi_mpifh.so.40 - error

2022-01-30 Thread Gilles Gouaillardet via users
Hari,


What does
ldd solver.exe
(or whatever your solver exe file is called) report?


Cheers,

Gilles

On Mon, Jan 31, 2022 at 2:09 PM Hari Devaraj via users <
users@lists.open-mpi.org> wrote:

> Hello,
>
> I am trying to run a FEA solver exe file.
> I get this error message:
>
> error while loading shared libraries: libmpi_mpifh.so.40: cannot open
> shared object file: No such file or directory
>
> Could someone please help?
>
> regards,
> lrckt
>


Re: [OMPI users] RES: OpenMPI - Intel MPI

2022-01-27 Thread Gilles Gouaillardet via users
Thanks Ralph,

Now I get what you had in mind.

Strictly speaking, you are making the assumption that Open MPI performance
matches the system MPI performance.

This is generally true for common interconnects and/or those that feature
providers for libfabric or UCX, but not so for "exotic" interconnects (that
might not be supported natively by Open MPI or abstraction layers) and/or
with an uncommon topology (for which collective communications are not
fully optimized by Open MPI). In the latter case, using the system/vendor
MPI is the best option performance-wise.

Cheers,

Gilles

On Fri, Jan 28, 2022 at 2:23 AM Ralph Castain via users <
users@lists.open-mpi.org> wrote:

> Just to complete this - there is always a lingering question regarding
> shared memory support. There are two ways to resolve that one:
>
> * run one container per physical node, launching multiple procs in each
> container. The procs can then utilize shared memory _inside_ the container.
> This is the cleanest solution (i.e., minimizes container boundary
> violations), but some users need/want per-process isolation.
>
> * run one container per MPI process, having each container then mount an
> _external_ common directory to an internal mount point. This allows each
> process to access the common shared memory location. As with the device
> drivers, you typically specify that external mount location when launching
> the container.
>
> Using those combined methods, you can certainly have a "generic" container
> that suffers no performance impact from bare metal. The problem has been
> that it takes a certain degree of "container savvy" to set this up and make
> it work - which is beyond what most users really want to learn. I'm sure
> the container community is working on ways to reduce that burden (I'm not
> really plugged into those efforts, but others on this list might be).
>
> Ralph
>
>
> > On Jan 27, 2022, at 7:39 AM, Ralph H Castain  wrote:
> >
> >> Fair enough Ralph! I was implicitly assuming a "build once / run
> everywhere" use case, my bad for not making my assumption clear.
> >> If the container is built to run on a specific host, there are indeed
> other options to achieve near native performances.
> >>
> >
> > Err...that isn't actually what I meant, nor what we did. You can, in
> fact, build a container that can "run everywhere" while still employing
> high-speed fabric support. What you do is:
> >
> > * configure OMPI with all the fabrics enabled (or at least all the ones
> you care about)
> >
> > * don't include the fabric drivers in your container. These can/will
> vary across deployments, especially those (like NVIDIA's) that involve
> kernel modules
> >
> > * setup your container to mount specified external device driver
> locations onto the locations where you configured OMPI to find them. Sadly,
> this does violate the container boundary - but nobody has come up with
> another solution, and at least the violation is confined to just the device
> drivers. Typically, you specify the external locations that are to be
> mounted using an envar or some other mechanism appropriate to your
> container, and then include the relevant information when launching the
> containers.
> >
> > When OMPI initializes, it will do its normal procedure of attempting to
> load each fabric's drivers, selecting the transports whose drivers it can
> load. NOTE: beginning with OMPI v5, you'll need to explicitly tell OMPI to
> build without statically linking in the fabric plugins or else this
> probably will fail.
> >
> > At least one vendor now distributes OMPI containers preconfigured with
> their fabric support based on this method. So using a "generic" container
> doesn't mean you lose performance - in fact, our tests showed zero impact
> on performance using this method.
> >
> > HTH
> > Ralph
> >
>
>
>


Re: [OMPI users] RES: OpenMPI - Intel MPI

2022-01-26 Thread Gilles Gouaillardet via users
Fair enough Ralph!

I was implicitly assuming a "build once / run everywhere" use case, my bad
for not making my assumption clear.

If the container is built to run on a specific host, there are indeed other
options to achieve near native performances.

Cheers,

Gilles

On Thu, Jan 27, 2022 at 4:02 PM Ralph Castain via users <
users@lists.open-mpi.org> wrote:

> I'll disagree a bit there. You do want to use an MPI library in your
> container that is configued to perform on the host cluster. However, that
> doesn't mean you are constrained as Gilles describes. It takes a little
> more setup knowledge, true, but there are lots of instructions and
> knowledgeable people out there to help. Experiments have shown that using
> non-system MPIs provide at least equivalent performance to the native MPIs
> when configured. Matching the internal/external MPI implementations may
> simplify the mechanics of setting it up, but it is definitely not required.
>
>
> On Jan 26, 2022, at 8:55 PM, Gilles Gouaillardet via users <
> users@lists.open-mpi.org> wrote:
>
> Brian,
>
> FWIW
>
> Keep in mind that when running a container on a supercomputer, it is
> generally recommended to use the supercomputer MPI implementation
> (fine tuned and with support for the high speed interconnect) instead of
> the one of the container (generally a vanilla MPI with basic
> support for TCP and shared memory).
> That scenario implies several additional constraints, and one of them is
> that the MPI libraries of the host and the container are (oversimplified) ABI
> compatible.
>
> In your case, you would have to rebuild your container with MPICH (instead
> of Open MPI) so it can be "substituted" at run time with Intel MPI (MPICH
> based and ABI compatible).
>
> Cheers,
>
> Gilles
>
> On Thu, Jan 27, 2022 at 1:07 PM Brian Dobbins via users <
> users@lists.open-mpi.org> wrote:
>
>>
>> Hi Ralph,
>>
>>   Thanks for the explanation - in hindsight, that makes perfect sense,
>> since each process is operating inside the container and will of course
>> load up identical libraries, so data types/sizes can't be inconsistent.  I
>> don't know why I didn't realize that before.  I imagine the past issues I'd
>> experienced were just due to the PMI differences in the different MPI
>> implementations at the time.  I owe you a beer or something at the next
>> in-person SC conference!
>>
>>   Cheers,
>>   - Brian
>>
>>
>> On Wed, Jan 26, 2022 at 4:54 PM Ralph Castain via users <
>> users@lists.open-mpi.org> wrote:
>>
>>> There is indeed an ABI difference. However, the _launcher_ doesn't have
>>> anything to do with the MPI library. All that is needed is a launcher that
>>> can provide the key exchange required to wireup the MPI processes. At this
>>> point, both MPICH and OMPI have PMIx support, so you can use the same
>>> launcher for both. IMPI does not, and so the IMPI launcher will only
>>> support PMI-1 or PMI-2 (I forget which one).
>>>
>>> You can, however, work around that problem. For example, if the host
>>> system is using Slurm, then you could "srun" the containers and let Slurm
>>> perform the wireup. Again, you'd have to ensure that OMPI was built to
>>> support whatever wireup protocol the Slurm installation supported (which
>>> might well be PMIx today). Also works on Cray/ALPS. Completely bypasses the
>>> IMPI issue.
>>>
>>> Another option I've seen used is to have the host system start the
>>> containers (using ssh or whatever), providing the containers with access to
>>> a "hostfile" identifying the TCP address of each container. It is then easy
>>> for OMPI's mpirun to launch the job across the containers. I use this every
>>> day on my machine (using Docker Desktop with Docker containers, but the
>>> container tech is irrelevant here) to test OMPI. Pretty easy to set that
>>> up, and I should think the sys admins could do so for their users.
>>>
>>> Finally, you could always install the PMIx Reference RTE (PRRTE) on the
>>> cluster as that executes at user level, and then use PRRTE to launch your
>>> OMPI containers. OMPI runs very well under PRRTE - in fact, PRRTE is the
>>> RTE embedded in OMPI starting with the v5.0 release.
>>>
>>> Regardless of your choice of method, the presence of IMPI doesn't
>>> preclude using OMPI containers so long as the OMPI library is fully
>>> contained in that container. Choice of launch method just depends on how
>>> your system is setup.
>>>

Re: [OMPI users] RES: OpenMPI - Intel MPI

2022-01-26 Thread Gilles Gouaillardet via users
Brian,

FWIW

Keep in mind that when running a container on a supercomputer, it is
generally recommended to use the supercomputer MPI implementation
(fine tuned and with support for the high speed interconnect) instead of
the one of the container (generally a vanilla MPI with basic
support for TCP and shared memory).
That scenario implies several additional constraints, and one of them is
that the MPI libraries of the host and the container are (oversimplified) ABI
compatible.

In your case, you would have to rebuild your container with MPICH (instead
of Open MPI) so it can be "substituted" at run time with Intel MPI (MPICH
based and ABI compatible).

Cheers,

Gilles

On Thu, Jan 27, 2022 at 1:07 PM Brian Dobbins via users <
users@lists.open-mpi.org> wrote:

>
> Hi Ralph,
>
>   Thanks for the explanation - in hindsight, that makes perfect sense,
> since each process is operating inside the container and will of course
> load up identical libraries, so data types/sizes can't be inconsistent.  I
> don't know why I didn't realize that before.  I imagine the past issues I'd
> experienced were just due to the PMI differences in the different MPI
> implementations at the time.  I owe you a beer or something at the next
> in-person SC conference!
>
>   Cheers,
>   - Brian
>
>
> On Wed, Jan 26, 2022 at 4:54 PM Ralph Castain via users <
> users@lists.open-mpi.org> wrote:
>
>> There is indeed an ABI difference. However, the _launcher_ doesn't have
>> anything to do with the MPI library. All that is needed is a launcher that
>> can provide the key exchange required to wireup the MPI processes. At this
>> point, both MPICH and OMPI have PMIx support, so you can use the same
>> launcher for both. IMPI does not, and so the IMPI launcher will only
>> support PMI-1 or PMI-2 (I forget which one).
>>
>> You can, however, work around that problem. For example, if the host
>> system is using Slurm, then you could "srun" the containers and let Slurm
>> perform the wireup. Again, you'd have to ensure that OMPI was built to
>> support whatever wireup protocol the Slurm installation supported (which
>> might well be PMIx today). Also works on Cray/ALPS. Completely bypasses the
>> IMPI issue.
>>
>> Another option I've seen used is to have the host system start the
>> containers (using ssh or whatever), providing the containers with access to
>> a "hostfile" identifying the TCP address of each container. It is then easy
>> for OMPI's mpirun to launch the job across the containers. I use this every
>> day on my machine (using Docker Desktop with Docker containers, but the
>> container tech is irrelevant here) to test OMPI. Pretty easy to set that
>> up, and I should think the sys admins could do so for their users.
>>
>> Finally, you could always install the PMIx Reference RTE (PRRTE) on the
>> cluster as that executes at user level, and then use PRRTE to launch your
>> OMPI containers. OMPI runs very well under PRRTE - in fact, PRRTE is the
>> RTE embedded in OMPI starting with the v5.0 release.
>>
>> Regardless of your choice of method, the presence of IMPI doesn't
>> preclude using OMPI containers so long as the OMPI library is fully
>> contained in that container. Choice of launch method just depends on how
>> your system is setup.
>>
>> Ralph
>>
>>
>> On Jan 26, 2022, at 3:17 PM, Brian Dobbins  wrote:
>>
>>
>> Hi Ralph,
>>
>> Afraid I don't understand. If your image has the OMPI libraries installed
>>> in it, what difference does it make what is on your host? You'll never see
>>> the IMPI installation.
>>>
>>
>>> We have been supporting people running that way since Singularity was
>>> originally released, without any problems. The only time you can hit an
>>> issue is if you try to mount the MPI libraries from the host (i.e., violate
>>> the container boundary) - so don't do that and you should be fine.
>>>
>>
>>   Can you clarify what you mean here?  I thought there was an ABI
>> difference between the various MPICH-based MPIs and OpenMPI, meaning you
>> can't use a host's Intel MPI to launch a container's OpenMPI-compiled
>> program.  You *can* use the internal-to-the-container OpenMPI to launch
>> everything, which is easy for single-node runs but more challenging for
>> multi-node ones.  Maybe my understanding is wrong or out of date though?
>>
>>   Thanks,
>>   - Brian
>>
>>
>>
>>>
>>>
>>> On Jan 26, 2022, at 12:19 PM, Luis Alfredo Pires Barbosa <
>>> luis_pire...@hotmail.com> wrote:
>>>
>>> Hi Ralph,
>>>
>>> My Singularity image has OpenMPI, but my host doesn't (it has Intel MPI),
>>> and I am not sure if the system would work with Intel + OpenMPI.
>>>
>>> Luis
>>>
>>> Sent from Mail for Windows
>>>
>>> *From: *Ralph Castain via users 
>>> *Sent: *Wednesday, January 26, 2022 16:01
>>> *To: *Open MPI Users 
>>> *Cc: *Ralph Castain 
>>> *Subject: *Re: [OMPI users] OpenMPI - Intel MPI
>>>
>>> Err...the whole point of a container is to put all the library
>>> dependencies 

Re: [OMPI users] Creating An MPI Job from Procs Launched by a Different Launcher

2022-01-25 Thread Gilles Gouaillardet via users
You need a way for your process to exchange information so MPI_Init() 
can work.

 

One option is to have your custom launcher implement a PMIx server

https://pmix.github.io

If you choose this path, you will likely want to use the Open PMIx 
reference implementation

https://openpmix.github.io

so your custom launcher would "only" have to implement a limited number 
of callbacks.
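
To make that a bit more concrete, here is a minimal sketch of the server-side
entry points (the file name and build line are just assumptions, and the
zeroed callback module is only enough to start the server, not to run a real
MPI job):

/* pmix_server_skel.c - hypothetical skeleton of a custom launcher
 * bringing up an Open PMIx server.
 * Build (paths are assumptions): cc pmix_server_skel.c -lpmix
 */
#include <stdio.h>
#include <string.h>
#include <pmix_server.h>

int main(void)
{
    pmix_status_t rc;

    /* a real launcher fills this module with its callbacks
     * (fence, direct modex, spawn, ...) */
    pmix_server_module_t mymodule;
    memset(&mymodule, 0, sizeof(mymodule));

    rc = PMIx_server_init(&mymodule, NULL, 0);
    if (PMIX_SUCCESS != rc) {
        fprintf(stderr, "PMIx_server_init failed: %d\n", rc);
        return 1;
    }

    /* before fork/exec'ing each MPI process, the launcher would call
     * PMIx_server_register_nspace(), PMIx_server_register_client() and
     * PMIx_server_setup_fork() so that MPI_Init() can find the server
     * and exchange its connection information */

    PMIx_server_finalize();
    return 0;
}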

 

Cheers,

 

Gilles

 

- Original Message -

Any pointers?

On Tue, Jan 25, 2022 at 12:55 PM Ralph Castain via users  wrote:

Short answer is yes, but it is a bit complicated to do.

On Jan 25, 2022, at 12:28 PM, Saliya Ekanayake via users  wrote:

Hi,

I am trying to run an MPI program on a platform that launches the
processes using a custom launcher (not mpiexec). This will end up
spawning N processes of the program, but I am not sure if MPI_Init()
would work or not in this case?

Is it possible to have a group of processes launched by some other means
to be tied into an MPI communicator?

Thank you,

Saliya

--
Saliya Ekanayake, Ph.D
Cloud Accelerated Systems & Technologies (CAST)
Microsoft


Re: [OMPI users] Open MPI + Slurm + lmod

2022-01-25 Thread Gilles Gouaillardet via users
Matthias,

Thanks for the clarifications.

Unfortunately, I cannot connect the dots and I must be missing something.

If I recap correctly:
 - SLURM has builtin PMIx support
 - Open MPI has builtin PMIx support
 - srun explicitly requires PMIx (srun --mpi=pmix_v3 ...)
 - and yet Open MPI issues an error message stating missing support for PMI
(aka SLURM provided PMI1/PMI2)
So it seems the Open MPI builtin PMIx client is unable to find/communicate with
the SLURM PMIx server.

PMIx has cross version compatibility (e.g. client and server can have some
different versions), but with some restrictions
Could this be the root cause?

What is the PMIx library version used by SLURM?
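
If it helps, a quick way to check the version reported by a given libpmix
(for example the one SLURM was built against) is a tiny probe like this
(file name and include/library paths are assumptions):

/* pmix_version.c - print the version string reported by libpmix
 * Build (paths are assumptions):
 *   cc pmix_version.c -I$PMIX_HOME/include -L$PMIX_HOME/lib -lpmix
 */
#include <stdio.h>
#include <pmix.h>

int main(void)
{
    /* PMIx_Get_version() only reports the library version string,
     * so no PMIx_Init() should be needed here */
    printf("PMIx library version: %s\n", PMIx_Get_version());
    return 0;
}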


Ralph, do you see any reason why Open MPI and SLURM cannot
communicate via PMIx?


Cheers,

Gilles

On Tue, Jan 25, 2022 at 5:47 PM Matthias Leopold <
matthias.leop...@meduniwien.ac.at> wrote:

> Hi Gilles,
>
> I'm indeed using srun, I didn't have luck using mpirun yet.
> Are option 2 + 3 of your list really different things? As far as I
> understood now I need "Open MPI with PMI support", THEN I can use srun
> with PMIx. Right now using "srun --mpi=pmix(_v3)" gives the error
> mentioned below.
>
> Best,
> Matthias
>
> On 25.01.22 at 07:17, Gilles Gouaillardet via users wrote:
> > Matthias,
> >
> > do you run the MPI application with mpirun or srun?
> >
> > The error log suggests you are using srun, and SLURM only provides
> > PMI support.
> > If this is the case, then you have three options:
> >   - use mpirun
> >   - rebuild Open MPI with PMI support as Ralph previously explained
> >   - use SLURM PMIx:
> >  srun --mpi=list
> >  will list the PMI flavors provided by SLURM
> > a) if PMIx is not supported, contact your sysadmin and ask for it
> > b) if PMIx is supported but is not the default, ask for it, for
> > example with
> > srun --mpi=pmix_v3 ...
> >
> > Cheers,
> >
> > Gilles
> >
> > On Tue, Jan 25, 2022 at 12:30 AM Ralph Castain via users
> > mailto:users@lists.open-mpi.org>> wrote:
> >
> > You should probably ask them - I see in the top one that they used a
> > platform file, which likely had the missing option in it. The bottom
> > one does not use that platform file, so it was probably missed.
> >
> >
> >  > On Jan 24, 2022, at 7:17 AM, Matthias Leopold via users
> > mailto:users@lists.open-mpi.org>> wrote:
> >  >
> >  > To be sure: both packages were provided by NVIDIA (I didn't
> > compile them)
> >  >
> >  > Am 24.01.22 um 16:13 schrieb Matthias Leopold:
> >  >> Thx, but I don't see this option in any of the two versions:
> >  >> /usr/mpi/gcc/openmpi-4.1.2a1/bin/ompi_info (works with slurm):
> >  >>   Configure command line: '--build=x86_64-linux-gnu'
> > '--prefix=/usr' '--includedir=${prefix}/include'
> > '--mandir=${prefix}/share/man' '--infodir=${prefix}/share/info'
> > '--sysconfdir=/etc' '--localstatedir=/var' '--disable-silent-rules'
> > '--libexecdir=${prefix}/lib/openmpi' '--disable-maintainer-mode'
> > '--disable-dependency-tracking'
> > '--prefix=/usr/mpi/gcc/openmpi-4.1.2a1'
> > '--with-platform=contrib/platform/mellanox/optimized'
> >  >> lmod ompi (doesn't work with slurm)
> >  >>   Configure command line:
> >
>  '--prefix=/proj/nv/libraries/Linux_x86_64/dev/openmpi4/205295-dev-clean-1'
> > 'CC=nvc -nomp' 'CXX=nvc++ -nomp' 'FC=nvfortran -nomp' 'CFLAGS=-O1
> > -fPIC -c99 -tp p7-64' 'CXXFLAGS=-O1 -fPIC -tp p7-64' 'FCFLAGS=-O1
> > -fPIC -tp p7-64' 'LD=ld' '--enable-shared' '--enable-static'
> > '--without-tm' '--enable-mpi-cxx' '--disable-wrapper-runpath'
> > '--enable-mpirun-prefix-by-default' '--with-libevent=internal'
> > '--with-slurm' '--without-libnl' '--enable-mpi1-compatibility'
> > '--enable-mca-no-build=btl-uct' '--without-verbs'
> > '--with-cuda=/proj/cuda/11.0/Linux_x86_64'
> >
>  '--with-ucx=/proj/nv/libraries/Linux_x86_64/dev/openmpi4/205295-dev-clean-1'
> > Matthias
> >  >> Am 24.01.22 um 15:59 schrieb Ralph Castain via users:
> >  >>> If you look at your configure line, you forgot to include
> > --with-pmi=. We don't build the Slurm PMI
> > support by default due to the GPL licensing issues - you have to
> > point at it.
> >  >>>
> >  >>>
> >  >>>> On Jan 24, 2022, at 6:41 AM, Matthias Leopold via users
> > mailto:users@lists.open-mpi.org

Re: [OMPI users] Open MPI + Slurm + lmod

2022-01-24 Thread Gilles Gouaillardet via users
Matthias,

do you run the MPI application with mpirun or srun?

The error log suggests you are using srun, and SLURM only provides PMI
support.
If this is the case, then you have three options:
 - use mpirun
 - rebuild Open MPI with PMI support as Ralph previously explained
 - use SLURM PMIx:
srun --mpi=list
will list the PMI flavors provided by SLURM
   a) if PMIx is not supported, contact your sysadmin and ask for it
   b) if PMIx is supported but is not the default, ask for it, for example
with
   srun --mpi=pmix_v3 ...

Cheers,

Gilles

On Tue, Jan 25, 2022 at 12:30 AM Ralph Castain via users <
users@lists.open-mpi.org> wrote:

> You should probably ask them - I see in the top one that they used a
> platform file, which likely had the missing option in it. The bottom one
> does not use that platform file, so it was probably missed.
>
>
> > On Jan 24, 2022, at 7:17 AM, Matthias Leopold via users <
> users@lists.open-mpi.org> wrote:
> >
> > To be sure: both packages were provided by NVIDIA (I didn't compile them)
> >
> > On 24.01.22 at 16:13, Matthias Leopold wrote:
> >> Thx, but I don't see this option in any of the two versions:
> >> /usr/mpi/gcc/openmpi-4.1.2a1/bin/ompi_info (works with slurm):
> >>   Configure command line: '--build=x86_64-linux-gnu' '--prefix=/usr'
> '--includedir=${prefix}/include' '--mandir=${prefix}/share/man'
> '--infodir=${prefix}/share/info' '--sysconfdir=/etc' '--localstatedir=/var'
> '--disable-silent-rules' '--libexecdir=${prefix}/lib/openmpi'
> '--disable-maintainer-mode' '--disable-dependency-tracking'
> '--prefix=/usr/mpi/gcc/openmpi-4.1.2a1'
> '--with-platform=contrib/platform/mellanox/optimized'
> >> lmod ompi (doesn't work with slurm)
> >>   Configure command line:
> '--prefix=/proj/nv/libraries/Linux_x86_64/dev/openmpi4/205295-dev-clean-1'
> 'CC=nvc -nomp' 'CXX=nvc++ -nomp' 'FC=nvfortran -nomp' 'CFLAGS=-O1 -fPIC
> -c99 -tp p7-64' 'CXXFLAGS=-O1 -fPIC -tp p7-64' 'FCFLAGS=-O1 -fPIC -tp
> p7-64' 'LD=ld' '--enable-shared' '--enable-static' '--without-tm'
> '--enable-mpi-cxx' '--disable-wrapper-runpath'
> '--enable-mpirun-prefix-by-default' '--with-libevent=internal'
> '--with-slurm' '--without-libnl' '--enable-mpi1-compatibility'
> '--enable-mca-no-build=btl-uct' '--without-verbs'
> '--with-cuda=/proj/cuda/11.0/Linux_x86_64'
> '--with-ucx=/proj/nv/libraries/Linux_x86_64/dev/openmpi4/205295-dev-clean-1'
> Matthias
> >> On 24.01.22 at 15:59, Ralph Castain via users wrote:
> >>> If you look at your configure line, you forgot to include
> --with-pmi=. We don't build the Slurm PMI support by
> default due to the GPL licensing issues - you have to point at it.
> >>>
> >>>
>  On Jan 24, 2022, at 6:41 AM, Matthias Leopold via users <
> users@lists.open-mpi.org> wrote:
> 
>  Hi,
> 
>  we have 2 DGX A100 machines and I'm trying to run nccl-tests (
> https://github.com/NVIDIA/nccl-tests) in various ways to understand how
> things work.
> 
>  I can successfully run nccl-tests on both nodes with Slurm (via srun)
> when built directly on a compute node against Open MPI 4.1.2 coming from a
> NVIDIA deb package.
> 
>  I can also build nccl-tests in a lmod environment with NVIDIA HPC SDK
> 21.09 with Open MPI 4.0.5. When I run this with Slurm (via srun) I get the
> following message:
> 
>  [foo:1140698] OPAL ERROR: Error in file
> ../../../../../opal/mca/pmix/pmix3x/pmix3x_client.c at line 112
> 
> 
> --
> 
>  The application appears to have been direct launched using "srun",
> 
>  but OMPI was not built with SLURM's PMI support and therefore cannot
> 
>  execute. There are several options for building PMI support under
> 
>  SLURM, depending upon the SLURM version you are using:
> 
> 
> 
>    version 16.05 or later: you can use SLURM's PMIx support. This
> 
>    requires that you configure and build SLURM --with-pmix.
> 
> 
> 
>    Versions earlier than 16.05: you must use either SLURM's PMI-1 or
> 
>    PMI-2 support. SLURM builds PMI-1 by default, or you can manually
> 
>    install PMI-2. You must then build Open MPI using --with-pmi
> pointing
> 
>    to the SLURM PMI library location.
> 
> 
> 
>  Please configure as appropriate and try again.
> 
> 
> --
> 
>  *** An error occurred in MPI_Init
> 
>  *** on a NULL communicator
> 
>  *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now
> abort,
> 
>  ***and potentially your MPI job)
> 
> 
> 
>  When I look at PMI support in both Open MPI packages I don't see a
> lot of difference:
> 
>  “/usr/mpi/gcc/openmpi-4.1.2a1/bin/ompi_info --parsable | grep -i pmi”:
> 
>  mca:pmix:isolated:version:“mca:2.1.0”
>  

Re: [OMPI users] unexpected behavior when combining MPI_Gather and MPI_Type_vector

2021-12-16 Thread Gilles Gouaillardet via users
Jonas,

In case I misunderstood your question and you want to print

v_glob on P0: 9x2
0 9
1 10
2 11
3 12
4 13
5 14
6 15
7 16
8 17

then you have to fix the print invocation

// note: print an additional column to show the displacement error we get:

if (!rank) print("v_glob", rank, n, m, v_glob);

and also resize rtype so the second element starts at v_glob[3][0] => upper
bound = (3*sizeof(int))
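
For completeness, here is a self-contained sketch of that resize (hypothetical
file name; int data, nloc=3 and m=2 hard coded as in your example, run with 3
ranks):

/* gather_colmajor.c - column-major gather with a resized receive type */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, nprocs;
    const int nloc = 3, m = 2;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    const int n = nloc * nprocs;

    /* local nloc x m block, column-major: v_loc[i + j*nloc] */
    int *v_loc = malloc(nloc * m * sizeof(int));
    for (int j = 0; j < m; j++)
        for (int i = 0; i < nloc; i++)
            v_loc[i + j * nloc] = rank * nloc + i + j * n;

    int *v_glob = NULL;
    if (rank == 0) v_glob = calloc(n * m, sizeof(int));

    /* receive side: m blocks of nloc ints, stride n ints ... */
    MPI_Datatype vec, rtype;
    MPI_Type_vector(m, nloc, n, MPI_INT, &vec);
    /* ... resized so consecutive ranks start nloc ints apart */
    MPI_Type_create_resized(vec, 0, nloc * sizeof(int), &rtype);
    MPI_Type_commit(&rtype);

    /* send side is just the contiguous local block */
    MPI_Gather(v_loc, nloc * m, MPI_INT, v_glob, 1, rtype, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < m; j++) printf("%d ", v_glob[i + j * n]);
            printf("\n");
        }
        free(v_glob);
    }

    MPI_Type_free(&rtype);
    MPI_Type_free(&vec);
    free(v_loc);
    MPI_Finalize();
    return 0;
}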

By the way, since this question is not Open MPI specific, sites such as
Stack Overflow are a better fit.


Cheers,

Gilles
On Thu, Dec 16, 2021 at 6:46 PM Gilles Gouaillardet <
gilles.gouaillar...@gmail.com> wrote:

> Jonas,
>
> Assuming v_glob is what you expect, you will need to
> `MPI_Type_create_resized()` the received type so the block received
> from process 1 will be placed at the right position (v_glob[3][1] => upper
> bound = (4*3+1) * sizeof(int))
>
> Cheers,
>
> Gilles
>
> On Thu, Dec 16, 2021 at 6:33 PM Jonas Thies via users <
> users@lists.open-mpi.org> wrote:
>
>> Dear OpenMPI community,
>>
>> Here's a little puzzle for the Christmas holidays (although I would
>> really appreciate a quick solution!).
>>
>> I'm stuck with the following relatively basic problem: given a local nloc
>> x m matrix X_p in column-major ordering on each MPI process p, perform a
>> single MPI_Gather operation to construct the matrix
>> X_0
>> X_1
>> ...
>>
>> X_nproc
>>
>> again, in col-major ordering. My approach is to use MPI_Type_vector to
>> define an stype and an rtype, where stype has stride nloc, and rtype has
>> stride nproc*nloc. The observation is that there is an unexpected
>> displacement of (m-1)*n*p in the result array for the part arriving from
>> process p.
>>
>> The MFE code is attached, and I use OpenMPI 4.0.5 with GCC 11.2 (although
>> other versions and even distributions seem to display the same behavior).
>> Example (nloc=3, nproc=3, m=2, with some additional columns printed for the
>> sake of demonstration):
>>
>>
>> > mpicxx -o matrix_gather matrix_gather.cpp
>> mpirun -np 3 ./matrix_gather
>>
>> v_loc on P0: 3x2
>> 0 9
>> 1 10
>> 2 11
>>
>> v_loc on P1: 3x2
>> 3 12
>> 4 13
>> 5 14
>>
>> v_loc on P2: 3x2
>> 6 15
>> 7 16
>> 8 17
>>
>> v_glob on P0: 9x4
>> 0 9 0 0
>> 1 10 0 0
>> 2 11 0 0
>> 0 3 12 0
>> 0 4 13 0
>> 0 5 14 0
>> 0 0 6 15
>> 0 0 7 16
>> 0 0 8 17
>>
>> Any ideas?
>>
>> Thanks,
>>
>> Jonas
>>
>>
>> --
>> *J. Thies*
>> Assistant Professor
>>
>> TU Delft
>> Faculty Electrical Engineering, Mathematics and Computer Science
>> Institute of Applied Mathematics and High Performance Computing Center
>> Mekelweg 4
>> 2628 CD Delft
>>
>> T +31 15 27 
>> *j.th...@tudelft.nl *
>>
>


Re: [OMPI users] unexpected behavior when combining MPI_Gather and MPI_Type_vector

2021-12-16 Thread Gilles Gouaillardet via users
Jonas,

Assuming v_glob is what you expect, you will need to
`MPI_Type_create_resized()` the received type so the block received
from process 1 will be placed at the right position (v_glob[3][1] => upper
bound = (4*3+1) * sizeof(int))

Cheers,

Gilles

On Thu, Dec 16, 2021 at 6:33 PM Jonas Thies via users <
users@lists.open-mpi.org> wrote:

> Dear OpenMPI community,
>
> Here's a little puzzle for the Christmas holidays (although I would really
> appreciate a quick solution!).
>
> I'm stuck with the following relatively basic problem: given a local nloc
> x m matrix X_p in column-major ordering on each MPI process p, perform a
> single MPI_Gather operation to construct the matrix
> X_0
> X_1
> ...
>
> X_nproc
>
> again, in col-major ordering. My approach is to use MPI_Type_vector to
> define an stype and an rtype, where stype has stride nloc, and rtype has
> stride nproc*nloc. The observation is that there is an unexpected
> displacement of (m-1)*n*p in the result array for the part arriving from
> process p.
>
> The MFE code is attached, and I use OpenMPI 4.0.5 with GCC 11.2 (although
> other versions and even distributions seem to display the same behavior).
> Example (nloc=3, nproc=3, m=2, with some additional columns printed for the
> sake of demonstration):
>
>
> > mpicxx -o matrix_gather matrix_gather.cpp
> mpirun -np 3 ./matrix_gather
>
> v_loc on P0: 3x2
> 0 9
> 1 10
> 2 11
>
> v_loc on P1: 3x2
> 3 12
> 4 13
> 5 14
>
> v_loc on P2: 3x2
> 6 15
> 7 16
> 8 17
>
> v_glob on P0: 9x4
> 0 9 0 0
> 1 10 0 0
> 2 11 0 0
> 0 3 12 0
> 0 4 13 0
> 0 5 14 0
> 0 0 6 15
> 0 0 7 16
> 0 0 8 17
>
> Any ideas?
>
> Thanks,
>
> Jonas
>
>
> --
> *J. Thies*
> Assistant Professor
>
> TU Delft
> Faculty Electrical Engineering, Mathematics and Computer Science
> Institute of Applied Mathematics and High Performance Computing Center
> Mekelweg 4
> 2628 CD Delft
>
> T +31 15 27 
> *j.th...@tudelft.nl *
>


Re: [OMPI users] Reserving slots and filling them after job launch with MPI_Comm_spawn

2021-11-03 Thread Gilles Gouaillardet via users
Kurt,

Assuming you built Open MPI with tm support (default if tm is detected at
configure time, but you can configure --with-tm to have it abort if tm
support is not found), you should not need to use a hostfile.

As a workaround, I would suggest you try to
mpirun --map-by node -np 21 ...
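
On the MPI_Comm_spawn() side, a common way to keep the children on the
manager's own node is the standard "host" info key; here is a minimal sketch
(the worker executable name and the number of spawned processes are
assumptions, and this is not necessarily the fix for the allocation message
below):

/* manager_spawn.c - sketch: each manager spawns workers on its own node */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* assumes the processor name matches the node names in the allocation */
    char host[MPI_MAX_PROCESSOR_NAME];
    int len;
    MPI_Get_processor_name(host, &len);

    /* ask for the children to be placed on this very node */
    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "host", host);

    MPI_Comm intercomm;
    int nworkers = 8;                     /* slots left on this node */
    MPI_Comm_spawn("./worker", MPI_ARGV_NULL, nworkers, info,
                   0, MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);

    /* a real manager would communicate over intercomm before disconnecting */
    MPI_Info_free(&info);
    MPI_Comm_disconnect(&intercomm);
    MPI_Finalize();
    return 0;
}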


Cheers,

Gilles

On Wed, Nov 3, 2021 at 6:06 PM Mccall, Kurt E. (MSFC-EV41) via users <
users@lists.open-mpi.org> wrote:

> I’m using OpenMPI 4.1.1 compiled with Nvidia’s nvc++ 20.9, and compiled
> with Torque support.
>
>
>
> I want to reserve multiple slots on each node, and then launch a single
> manager process on each node.   The remaining slots would be filled up as
> the manager spawns new processes with MPI_Comm_spawn on its local node.
>
>
>
> Here is the abbreviated mpiexec command, which I assume is the source of
> the problem described below (?).   The hostfile was created by Torque and
> it contains many repeated node names, one for each slot that it reserved.
>
>
>
> $ mpiexec --hostfile  MyHostFile  -np 21 -npernode 1  (etc.)
>
>
>
>
>
> When MPI_Comm_spawn is called, MPI is reporting that “All nodes which are
> allocated for this job are already filled."   They don’t appear to be
> filled as it also reports that only one slot is in use for each node:
>
>
>
> ==   ALLOCATED NODES   ==
>
> n022: flags=0x11 slots=9 max_slots=0 slots_inuse=1 state=UP
>
> n021: flags=0x13 slots=9 max_slots=0 slots_inuse=1 state=UP
>
> n020: flags=0x13 slots=9 max_slots=0 slots_inuse=1 state=UP
>
> n018: flags=0x13 slots=9 max_slots=0 slots_inuse=1 state=UP
>
> n017: flags=0x13 slots=9 max_slots=0 slots_inuse=1 state=UP
>
> n016: flags=0x13 slots=9 max_slots=0 slots_inuse=1 state=UP
>
> n015: flags=0x13 slots=9 max_slots=0 slots_inuse=1 state=UP
>
> n014: flags=0x13 slots=9 max_slots=0 slots_inuse=1 state=UP
>
> n013: flags=0x13 slots=9 max_slots=0 slots_inuse=1 state=UP
>
> n012: flags=0x13 slots=9 max_slots=0 slots_inuse=1 state=UP
>
> n011: flags=0x13 slots=9 max_slots=0 slots_inuse=1 state=UP
>
> n010: flags=0x13 slots=9 max_slots=0 slots_inuse=1 state=UP
>
> n009: flags=0x13 slots=9 max_slots=0 slots_inuse=1 state=UP
>
> n008: flags=0x13 slots=9 max_slots=0 slots_inuse=1 state=UP
>
> n007: flags=0x13 slots=9 max_slots=0 slots_inuse=1 state=UP
>
> n006: flags=0x13 slots=9 max_slots=0 slots_inuse=1 state=UP
>
> n005: flags=0x13 slots=9 max_slots=0 slots_inuse=1 state=UP
>
> n004: flags=0x13 slots=9 max_slots=0 slots_inuse=1 state=UP
>
> n003: flags=0x13 slots=9 max_slots=0 slots_inuse=1 state=UP
>
> n002: flags=0x13 slots=9 max_slots=0 slots_inuse=1 state=UP
>
> n001: flags=0x13 slots=9 max_slots=0 slots_inuse=1 state=UP
>
>
>
> Do you have any idea what I am doing wrong?   My Torque qsub arguments are
> unchanged from when I successfully launched this kind of job structure
> under MPICH.   The relevant argument to qsub is the resource list, which is
> “-l  nodes=21:ppn=9”.
>
>
>


Re: [OMPI users] Newbie Question.

2021-11-01 Thread Gilles Gouaillardet via users
Hi Ben,

have you tried

export OMPI_MCA_common_ucx_opal_mem_hooks=1

Cheers,

Gilles

On Mon, Nov 1, 2021 at 9:22 PM bend linux4ms.net via users <
users@lists.open-mpi.org> wrote:

> Ok, I am a newbie supporting an HPC project and learning about MPI.
>
> I have the following portion of a shell script:
>
> export OMPI_MCA_btl_openib_allow_ib=1
> export OMPI_MCA_btl_openib_if_include="mlx5_0:1"
>
> mpirun -machinefile ${hostlist} \
>   --mca opal_common_ucx_opal_mem_hooks 1 \
>   -np $NP \
>   -N $rpn \
>   -vv \
>
> My question is: is there a way to take the '--mca
> opal_common_ucx_opal_mem_hooks 1' option and make
> it into an environment variable like the others?
>
> Thanks
>
> Ben Duncan - Business Network Solutions, Inc. 336 Elton Road Jackson MS,
> 39212
> "Never attribute to malice, that which can be adequately explained by
> stupidity"
> - Hanlon's Razor
>
>
>


Re: [OMPI users] Cannot build working Open MPI 4.1.1 with NAG Fortran/clang on macOS (but I could before!)

2021-10-28 Thread Gilles Gouaillardet via users
Matt,

did you build the same Open MPI 4.1.1 from an official tarball with the
previous NAG Fortran?
did you run autogen.pl (--force) ?

Just to be sure, can you rerun the same test with the previous NAG version?


When using static libraries, you can try manually linking with
-lopen-orted-mpir and see if it helps.
If you want to use shared libraries, I would try to run configure,
and then edit the generated libtool file:
look for a line like

CC="nagfor"

and then edit the next line


# Commands used to build a shared archive.

archive_cmds="\$CC -dynamiclib \$allow_undef ..."

simply manually remove "-dynamiclib" here and see if it helps


Cheers,

Gilles
On Fri, Oct 29, 2021 at 12:30 AM Matt Thompson via users <
users@lists.open-mpi.org> wrote:

> Dear Open MPI Gurus,
>
> This is a...confusing one. For some reason, I cannot build a working Open
> MPI with NAG 7.0.7062 and clang on my MacBook running macOS 11.6.1. The
> thing is, I could do this back in July with NAG 7.0.7048. So my fear is
> that something changed with macOS, or clang/xcode, or something in between.
>
> So here are the symptoms, I usually build with a few extra flags that I've
> always carried around but for now I'm going to go basic. First, I try to
> build Open MPI in a basic way:
>
> ../configure FCFLAGS"=-mismatch_all -fpp" CC=clang CXX=clang++ FC=nagfor
> --prefix=$HOME/installed/Compiler/nag-7.0_7062/openmpi/4.1.1-basic |& tee
> configure.log
>
> Note that the FCFLAGS are needed for NAG since it doesn't preprocess .F90
> files by default (so -fpp) and it can be *very* strict with interfaces and
> any slight interface difference is an error so we use -mismatch_all.
>
> Now with this configure line, I then build and:
>
> Making all in mpi/fortran/use-mpi-tkr
> make[2]: Entering directory
> '/Users/mathomp4/src/MPI/openmpi-4.1.1/build-basic/ompi/mpi/fortran/use-mpi-tkr'
>   FCLD libmpi_usempi.la
> NAG Fortran Compiler Release 7.0(Yurakucho) Build 7062
> Option error: Unrecognised option -dynamiclib
> make[2]: *** [Makefile:1966: libmpi_usempi.la] Error 2
> make[2]: Leaving directory
> '/Users/mathomp4/src/MPI/openmpi-4.1.1/build-basic/ompi/mpi/fortran/use-mpi-tkr'
> make[1]: *** [Makefile:3555: all-recursive] Error 1
> make[1]: Leaving directory
> '/Users/mathomp4/src/MPI/openmpi-4.1.1/build-basic/ompi'
> make: *** [Makefile:1901: all-recursive] Error 1
>
> For some reason, the make system is trying to pass a clang option,
> -dynamiclib, to nagfor and it fails. With verbose on:
>
> libtool: link: nagfor -dynamiclib -Wl,-Wl,,-undefined
> -Wl,-Wl,,dynamic_lookup -o .libs/libmpi_usempi.40.dylib  .libs/mpi.o
> .libs/mpi_aint_add_f90.o .libs/mpi_aint_diff_f90.o
> .libs/mpi_comm_spawn_multiple_f90.o .libs/mpi_testall_f90.o
> .libs/mpi_testsome_f90.o .libs/mpi_waitall_f90.o .libs/mpi_waitsome_f90.o
> .libs/mpi_wtick_f90.o .libs/mpi_wtime_f90.o .libs/mpi-tkr-sizeof.o...
>
> As a test, I tried the same thing with NAG 7.0.7048 (which worked in July)
> and I get the same issue:
>
> Option error: Unrecognised option -dynamiclib
>
> Note, that Intel Fortran and Gfortran *do* support this flag, but NAG has
> something like:
>
>-Bbinding Specify  static  or  dynamic binding.  This only has
> effect if specified during the link phase.  The default is dynamic binding.
>
> but maybe the Open MPI system doesn't know NAG?
>
> So I say to myself, okay, dynamiclib is a shared library sounding thing,
> so let's try static library build! So, following the documentation I try:
>
> ../configure --enable-static -disable-shared FCFLAGS"=-mismatch_all -fpp"
> CC=gcc CXX=g++ FC=nagfor
> --prefix=$HOME/installed/Compiler/nag-7.0_7062/openmpi/4.1.1-static |& tee
> configure.log
>
> and it builds! Yay! And then I try to build helloworld.c and it fails! To
> wit:
> ❯ cat helloworld.c
> /*The Parallel Hello World Program*/
> #include <stdio.h>
> #include <mpi.h>
>
> int main(int argc, char **argv)
> {
>    int node;
>
>    MPI_Init(&argc, &argv);
>    MPI_Comm_rank(MPI_COMM_WORLD, &node);
>
>    printf("Hello World from Node %d\n", node);
>
>    MPI_Finalize();
> }
> ❯
> /Users/mathomp4/installed/Compiler/nag-7.0_7062/openmpi/4.1.1-static/bin/mpicc
> helloworld.c
> Undefined symbols for architecture x86_64:
>   "_MPIR_Breakpoint", referenced from:
>   _orte_debugger_init_after_spawn in libopen-rte.a(orted_submit.o)
>   "_MPIR_attach_fifo", referenced from:
>   _orte_submit_finalize in libopen-rte.a(orted_submit.o)
>   _orte_submit_job in libopen-rte.a(orted_submit.o)
>   _open_fifo in libopen-rte.a(orted_submit.o)
>   "_MPIR_being_debugged", referenced from:
>   _ompi_rte_wait_for_debugger in libmpi.a(rte_orte_module.o)
>   _orte_submit_job in libopen-rte.a(orted_submit.o)
>   _orte_debugger_init_after_spawn in libopen-rte.a(orted_submit.o)
>   _attach_debugger in libopen-rte.a(orted_submit.o)
>   "_MPIR_debug_state", referenced from:
>   _orte_debugger_init_after_spawn in libopen-rte.a(orted_submit.o)
>   "_MPIR_executable_path", referenced from:
>   

Re: [OMPI users] OpenMPI 4.1.1, CentOS 7.9, nVidia HPC-SDk, build hints?

2021-09-30 Thread Gilles Gouaillardet via users
Carl,

I opened https://github.com/open-mpi/ompi/issues/9444 to specifically track
the issue related to the op/avx component

TL;DR
nvhpc compilers can compile AVX512 intrinsics (so far so good), but do not
define at least one of these macros
__AVX512BW__
 __AVX512F__
__AVX512VL__
and Open MPI is not happy about it.
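
For reference, a small probe like this (hypothetical file name, compiled with
the same flags as the Open MPI build) shows which of the macros a given
compiler defines:

/* avx512_macros.c - print which AVX-512 macros the compiler defines */
#include <stdio.h>

int main(void)
{
#ifdef __AVX512F__
    printf("__AVX512F__ defined\n");
#else
    printf("__AVX512F__ NOT defined\n");
#endif
#ifdef __AVX512BW__
    printf("__AVX512BW__ defined\n");
#else
    printf("__AVX512BW__ NOT defined\n");
#endif
#ifdef __AVX512VL__
    printf("__AVX512VL__ defined\n");
#else
    printf("__AVX512VL__ NOT defined\n");
#endif
    return 0;
}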

If you can have all these 3 macros defined by the nvhpc compilers, that
would be great!
Otherwise, I will let George decide if and how Open MPI addresses this issue

Cheers,

Gilles

On Thu, Sep 30, 2021 at 11:33 PM Carl Ponder via users <
users@lists.open-mpi.org> wrote:

>
> Are you able to fix the problem by adding extern qualifiers?
> We could push this change back to the OpenMPI developers...
>
> --
> Subject: Re: [OMPI users] OpenMPI 4.1.1, CentOS 7.9, nVidia HPC-SDk,
> build hints?
> Date: Thu, 30 Sep 2021 08:46:26 -0400
> From: Bennet Fauber  
> To: Carl Ponder  , Open MPI Users
>  
> CC: Ray Muno  
>
> *External email: Use caution opening links or attachments*
> You may be seeing this?
>
> C language issues Default to -fno-common
>
> A common mistake in C is omitting extern when declaring a global variable
> in a header file. If the header is included by several files it results in
> multiple definitions of the same variable. In previous GCC versions this
> error is ignored. GCC 10 defaults to -fno-common, which means a linker
> error will now be reported. To fix this, use extern in header files when
> declaring global variables, and ensure each global is defined in exactly
> one C file. If tentative definitions of particular variables need to be
> placed in a common block, __attribute__((__common__)) can be used to
> force that behavior even in code compiled without -fcommon. As a
> workaround, legacy C code where all tentative definitions should be placed
> into a common block can be compiled with -fcommon.
>
> https://gcc.gnu.org/gcc-10/porting_to.html
> 
>
>
> On Thu, Sep 30, 2021 at 6:56 AM Carl Ponder via users <
> users@lists.open-mpi.org> wrote:
>
>>
>> For now, you can suppress this error building OpenMPI 4.1.1
>>
>> ./.libs/liblocal_ops_avx512.a(liblocal_ops_avx512_la-op_avx_functions.o):(.data+0x0):
>> multiple definition of `ompi_op_avx_functions_avx2'
>> ./.libs/liblocal_ops_avx2.a(liblocal_ops_avx2_la-op_avx_functions.o):(.data+0x0):
>> first defined here
>> ./.libs/liblocal_ops_avx512.a(liblocal_ops_avx512_la-op_avx_functions.o):
>> In function`ompi_op_avx_2buff_min_uint16_t_avx2':
>> /project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/ompi/mca/op/avx/op_avx_functions.c:651:
>> multiple definition of `ompi_op_avx_3buff_functions_avx2'
>>
>> ./.libs/liblocal_ops_avx2.a(liblocal_ops_avx2_la-op_avx_functions.o):/project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/ompi/mca/op/avx/op_avx_functions.c:651:
>> first defined here
>>
>> with the NVHPC/PGI 21.9 compiler by using the setting
>>
>> configure --enable-mca-no-build=op-avx ...
>>
>> We're still looking at the cause here. I don't have any advice about the
>> problem with 21.7.
>>
>> --
>> Subject: Re: [OMPI users] OpenMPI 4.1.1, CentOS 7.9, nVidia HPC-SDk,
>> build hints?
>> Date: Wed, 29 Sep 2021 12:25:43 -0500
>> From: Ray Muno via users 
>> 
>> Reply-To: Open MPI Users 
>> 
>> To: users@lists.open-mpi.org
>> CC: Ray Muno  
>>
>> External email: Use caution opening links or attachments
>>
>>
>> Tried this
>>
>> configure CC='nvc -fPIC' CXX='nvc++ -fPIC' FC='nvfortran -fPIC'
>>
>> Configure completes. Compiles quite a way through. Dies in a different
>> place. It does get past the
>> first error, however with libmpi_usempif08.la
>> 
>>
>>
>> FCLD libmpi_usempif08.la
>> 
>> make[2]: Leaving directory
>>
>> `/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-HPC/21.9/ompi/mpi/fortran/use-mpi-f08'
>> Making all in mpi/fortran/mpiext-use-mpi-f08
>> make[2]: Entering directory

Re: [OMPI users] OpenMPI 4.1.1, CentOS 7.9, nVidia HPC-SDk, buildhints?

2021-09-30 Thread Gilles Gouaillardet via users
 Ray,



there is a typo, the configure option is

--enable-mca-no-build=op-avx



Cheers,



Gilles

- Original Message -

 Added --enable-mca-no-build=op-avx to the configure line. Still dies in 
the same place. 


CCLD mca_op_avx.la
./.libs/liblocal_ops_avx512.a(liblocal_ops_avx512_la-op_avx_functions.o):
(.data+0x0): multiple definition of `ompi_op_avx_functions_avx2'
./.libs/liblocal_ops_avx2.a(liblocal_ops_avx2_la-op_avx_functions.o):(.
data+0x0): first defined here
./.libs/liblocal_ops_avx512.a(liblocal_ops_avx512_la-op_avx_functions.o):
 In function `ompi_op_avx_2buff_min_uint16_t_avx2':
/project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/ompi/mca/op/avx/op_avx_
functions.c:651: multiple definition of `ompi_op_avx_3buff_functions_
avx2'
./.libs/liblocal_ops_avx2.a(liblocal_ops_avx2_la-op_avx_functions.o):/
project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/ompi/mca/op/avx/op_avx_
functions.c:651: first defined here
make[2]: *** [mca_op_avx.la] Error 2
make[2]: Leaving directory `/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-
HPC/21.9/ompi/mca/op/avx'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-
HPC/21.9/ompi'
make: *** [all-recursive] Error 1


On 9/30/21 5:54 AM, Carl Ponder wrote:

For now, you can suppress this error building OpenMPI 4.1.1
./.libs/liblocal_ops_avx512.a(liblocal_ops_avx512_la-op_avx_functions.o):
(.data+0x0): multiple definition of `ompi_op_avx_functions_avx2' 
./.libs/liblocal_ops_avx2.a(liblocal_ops_avx2_la-op_avx_functions.o):(.
data+0x0): first defined here 
./.libs/liblocal_ops_avx512.a(liblocal_ops_avx512_la-op_avx_functions.o):
 In function `ompi_op_avx_2buff_min_uint16_t_avx2': 
/project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/ompi/mca/op/avx/op_avx_
functions.c:651: multiple definition of `ompi_op_avx_3buff_functions_
avx2' 
./.libs/liblocal_ops_avx2.a(liblocal_ops_avx2_la-op_avx_functions.o):/
project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/ompi/mca/op/avx/op_avx_
functions.c:651: 
first defined here
with the NVHPC/PGI 21.9 compiler by using the setting
configure --enable-mca-no-build=op-avx ... 
We're still looking at the cause here. I don't have any advice about the 
problem with 21.7.

Subject:Re: [OMPI users] OpenMPI 4.1.1, CentOS 7.9, nVidia HPC-SDk, 
build hints?
Date:   Wed, 29 Sep 2021 12:25:43 -0500
From:   Ray Muno via users 
Reply-To:   Open MPI Users 
To: users@lists.open-mpi.org
CC: Ray Muno 


External email: Use caution opening links or attachments


Tried this

configure CC='nvc -fPIC' CXX='nvc++ -fPIC' FC='nvfortran -fPIC'

Configure completes. Compiles quite a way through. Dies in a different 
place. It does get past the
first error, however with libmpi_usempif08.la


FCLD libmpi_usempif08.la
make[2]: Leaving directory
`/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-HPC/21.9/ompi/mpi/fortran/use
-mpi-f08'
Making all in mpi/fortran/mpiext-use-mpi-f08
make[2]: Entering directory
`/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-HPC/21.9/ompi/mpi/fortran/
mpiext-use-mpi-f08'
PPFC mpi-f08-ext-module.lo
FCLD libforce_usempif08_module_to_be_built.la
make[2]: Leaving directory
`/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-HPC/21.9/ompi/mpi/fortran/
mpiext-use-mpi-f08'

Dies here now.

CCLD liblocal_ops_avx512.la
CCLD mca_op_avx.la
./.libs/liblocal_ops_avx512.a(liblocal_ops_avx512_la-op_avx_functions.o):
(.data+0x0): multiple
definition of `ompi_op_avx_functions_avx2'
./.libs/liblocal_ops_avx2.a(liblocal_ops_avx2_la-op_avx_functions.o):(.
data+0x0): first defined here
./.libs/liblocal_ops_avx512.a(liblocal_ops_avx512_la-op_avx_functions.o):
 In function
`ompi_op_avx_2buff_min_uint16_t_avx2':
/project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/ompi/mca/op/avx/op_avx_
functions.c:651: multiple
definition of `ompi_op_avx_3buff_functions_avx2'
./.libs/liblocal_ops_avx2.a(liblocal_ops_avx2_la-op_avx_functions.o):/
project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/ompi/mca/op/avx/op_avx_
functions.c:651:
first defined here
make[2]: *** [mca_op_avx.la] Error 2
make[2]: Leaving directory `/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-
HPC/21.9/ompi/mca/op/avx'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-
HPC/21.9/ompi'
make: *** [all-recursive] Error 1


On 9/29/21 11:42 AM, Bennet Fauber via users wrote:
Ray,

If all the errors about not being compiled with -fPIC are still 
appearing, there may be a bug that
is preventing the option from getting through to the compiler(s).  It 
might be worth looking through
the logs to see the full compile command for one or more of them to see 
whether that is true?  Say,
libs/comm_spawn_multiple_f08.o for example?

If -fPIC is missing, you may be able to recompile that manually with the 
-fPIC in place, then remake
and see if that also causes the link error to go away, that would be a 
good start.

Hope this helps,-- bennet



On Wed, Sep 29, 2021 at 12:29 PM Ray Muno via users mailto:users@lists.open-mpi.org>> wrote:

   

Re: [OMPI users] OpenMPI 4.1.1, CentOS 7.9, nVidia HPC-SDk, build hints?

2021-09-29 Thread Gilles Gouaillardet via users
Ray,

note there is a bug in nvc compilers since 21.3
(it has been reported and is documented at
https://github.com/open-mpi/ompi/issues/9402)

For the time being, I suggest you use gcc, g++ and nvfortran

FWIW, the AVX2 issue is likely caused by nvc **not** defining some macros
(that are both defined by at least GNU and LLVM compilers),
I will take a look at it when I get some time (you won't face this issue if
you use GNU compilers for C/C++)


Cheers,

Gilles

On Thu, Sep 30, 2021 at 2:31 AM Ray Muno via users 
wrote:

>
> Tried this
>
> configure   CC='nvc -fPIC' CXX='nvc++ -fPIC' FC='nvfortran -fPIC'
>
> Configure completes. Compiles quite a way through. Dies in a different
> place. It does get past the
> first error, however with libmpi_usempif08.la
>
>
>FCLD libmpi_usempif08.la
> make[2]: Leaving directory
>
> `/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-HPC/21.9/ompi/mpi/fortran/use-mpi-f08'
> Making all in mpi/fortran/mpiext-use-mpi-f08
> make[2]: Entering directory
>
> `/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-HPC/21.9/ompi/mpi/fortran/mpiext-use-mpi-f08'
>PPFC mpi-f08-ext-module.lo
>FCLD libforce_usempif08_module_to_be_built.la
> make[2]: Leaving directory
>
> `/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-HPC/21.9/ompi/mpi/fortran/mpiext-use-mpi-f08'
>
> Dies here now.
>
>   CCLD liblocal_ops_avx512.la
>CCLD mca_op_avx.la
> ./.libs/liblocal_ops_avx512.a(liblocal_ops_avx512_la-op_avx_functions.o):(.data+0x0):
> multiple
> definition of `ompi_op_avx_functions_avx2'
> ./.libs/liblocal_ops_avx2.a(liblocal_ops_avx2_la-op_avx_functions.o):(.data+0x0):
> first defined here
> ./.libs/liblocal_ops_avx512.a(liblocal_ops_avx512_la-op_avx_functions.o):
> In function
> `ompi_op_avx_2buff_min_uint16_t_avx2':
> /project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/ompi/mca/op/avx/op_avx_functions.c:651:
> multiple
> definition of `ompi_op_avx_3buff_functions_avx2'
> ./.libs/liblocal_ops_avx2.a(liblocal_ops_avx2_la-op_avx_functions.o):/project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/ompi/mca/op/avx/op_avx_functions.c:651:
>
> first defined here
> make[2]: *** [mca_op_avx.la] Error 2
> make[2]: Leaving directory
> `/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-HPC/21.9/ompi/mca/op/avx'
> make[1]: *** [all-recursive] Error 1
> make[1]: Leaving directory
> `/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-HPC/21.9/ompi'
> make: *** [all-recursive] Error 1
>
>
> On 9/29/21 11:42 AM, Bennet Fauber via users wrote:
> > Ray,
> >
> > If all the errors about not being compiled with -fPIC are still
> appearing, there may be a bug that
> > is preventing the option from getting through to the compiler(s).  It
> might be worth looking through
> > the logs to see the full compile command for one or more of them to see
> whether that is true?  Say,
> > libs/comm_spawn_multiple_f08.o for example?
> >
> > If -fPIC is missing, you may be able to recompile that manually with the
> -fPIC in place, then remake
> > and see if that also causes the link error to go away, that would be a
> good start.
> >
> > Hope this helps,-- bennet
> >
> >
> >
> > On Wed, Sep 29, 2021 at 12:29 PM Ray Muno via users <
> users@lists.open-mpi.org
> > > wrote:
> >
> > I did try that and it fails at the same place.
> >
> > Which version of the nVidia HPC-SDK are you using? I a m using
> 21.7.  I see there is an upgrade to
> > 21.9, which came out since I installed.  I have that installed and
> will try to see if they changed
> > anything. Not much in the releases notes to indicate any major
> changes.
> >
> > -Ray Muno
> >
> >
> > On 9/29/21 10:54 AM, Jing Gong wrote:
> >  > Hi,
> >  >
> >  >
> >  > Before Nvidia persons look into details,pProbably you can try to
> add the flag "-fPIC" to the
> >  > nvhpc compiler likes cc="nvc -fPIC", which at least worked with
> me.
> >  >
> >  >
> >  >
> >  > /Jing
> >  >
> >  >
> >
>  
> 
> >  > *From:* users  users-boun...@lists.open-mpi.org>> on
> > behalf of Ray Muno via users
> >  > mailto:users@lists.open-mpi.org>>
> >  > *Sent:* Wednesday, September 29, 2021 17:22
> >  > *To:* Open MPI User's List
> >  > *Cc:* Ray Muno
> >  > *Subject:* Re: [OMPI users] OpenMPI 4.1.1, CentOS 7.9, nVidia
> HPC-SDk, build hints?
> >  > Thanks, I looked through previous emails here in the user list.
> I guess I need to subscribe
> > to the
> >  > Developers list.
> >  >
> >  > -Ray Muno
> >  >
> >  > On 9/29/21 9:58 AM, Jeff Squyres (jsquyres) wrote:
> >  >> Ray --
> >  >>
> >  >> Looks like this is a dup of
> https://github.com/open-mpi/ompi/issues/8919
> >  <
> https://github.com/open-mpi/ompi/issues/8919
> > >
> >  >> 

Re: [OMPI users] (Fedora 34, x86_64-pc-linux-gnu, openmpi-4.1.1.tar.gz): PML ucx cannot be selected

2021-09-13 Thread Gilles Gouaillardet via users
Jorge,

I am not that familiar with UCX, but I hope that will help:

The changes I mentioned were introduced by
https://github.com/open-mpi/ompi/pull/8549

I suspect mpirun --mca pml_ucx_tls any --mca pml_ucx_devices any --mca pml ucx
...

will do what you expect


Cheers,

Gilles

On Mon, Sep 13, 2021 at 9:05 PM Jorge D'Elia via users <
users@lists.open-mpi.org> wrote:

> Dear Gilles,
>
> Despite my last answer (see below), I am noticing that
> some tests with a coarray fortran code on a laptop show a
> performance drop of the order of 20% using the 4.1.1 version
> (with --mca pml ucx disabled), versus the 4.1.0 one
> (with --mca pml ucx enabled).
>
> I would like to experiment with pml/ucx framework using the 4.1.0
> version on that laptop. Then, please, how do I manually re-enable
> those providers? (e.g. perhaps, is it during the construction
> stage?) or where can I find out how to do it? Thanks in advance.
>
> Regards.
> Jorge.
>
> - Mensaje original -
> > De: "Open MPI Users" 
> > Para: "Open MPI Users" 
> > CC: "Jorge D'Elia"
> > Enviado: Sábado, 29 de Mayo 2021 7:18:23
> > Asunto: Re: [OMPI users] (Fedora 34, x86_64-pc-linux-gnu,
> openmpi-4.1.1.tar.gz): PML ucx cannot be selected
> >
> > Dear Gilles,
> >
> > Ahhh ... now the new behavior is better understood.
> > The intention of using pml/ucx was simply for preliminary
> > testing, and does not merit re-enabling these providers in
> > this notebook.
> >
> > Thank you very much for the clarification.
> >
> > Regards,
> > Jorge.
> >
> > - Mensaje original -
> >> De: "Gilles Gouaillardet"
> >> Para: "Jorge D'Elia" , "Open MPI Users" 
> >> Enviado: Viernes, 28 de Mayo 2021 23:35:37
> >> Asunto: Re: [OMPI users] (Fedora 34, x86_64-pc-linux-gnu,
> openmpi-4.1.1.tar.gz):
> >> PML ucx cannot be selected
> >>
> >> Jorge,
> >>
> >> pml/ucx used to be selected when no fast interconnect were detected
> >> (since ucx provides driver for both TCP and shared memory).
> >> These providers are now disabled by default, so unless your machine
> >> has a supported fast interconnect (such as Infiniband),
> >> pml/ucx cannot be used out of the box anymore.
> >>
> >> if you really want to use pml/ucx on your notebook, you need to
> >> manually re-enable these providers.
> >>
> >> That being said, your best choice here is really not to force any pml,
> >> and let Open MPI use pml/ob1
> >> (that has support for both TCP and shared memory)
> >>
> >> Cheers,
> >>
> >> Gilles
> >>
> >> On Sat, May 29, 2021 at 11:19 AM Jorge D'Elia via users
> >>  wrote:
> >>>
> >>> Hi,
> >>>
> >>> We routinely build OpenMPI on x86_64-pc-linux-gnu machines from
> >>> the sources using gcc and usually everything works fine.
> >>>
> >>> In one case we recently installed Fedora 34 from scratch on an
> >>> ASUS G53SX notebook (Intel Core i7-2630QM CPU 2.00GHz ×4 cores,
> >>> without any IB device). Next we build OpenMPI using the file
> >>> openmpi-4.1.1.tar.gz and the GCC 12.0.0 20210524 (experimental)
> >>> compiler.
> >>>
> >>> However, when trying to experiment OpenMPI using UCX
> >>> with a simple test, we get the runtime errors:
> >>>
> >>>   No components were able to be opened in the btl framework.
> >>>   PML ucx cannot be selected
> >>>
> >>> while the test worked fine until Fedora 33 on the same
> >>> machine using the same OpenMPI configuration.
> >>>
> >>> We attach below some info about a simple test run.
> >>>
> >>> Please, any clues where to check or maybe something is missing?
> >>> Thanks in advance.
> >>>
> >>> Regards
> >>> Jorge.
> >>>
> >>> --
> >>> $ cat /proc/version
> >>> Linux version 5.12.7-300.fc34.x86_64
> >>> (mockbu...@bkernel01.iad2.fedoraproject.org) (gcc (GCC) 11.1.1
> 20210428 (Red
> >>> Hat 11.1.1-1), GNU ld version 2.35.1-41.fc34) #1 SMP Wed May 26
> 12:58:58 UTC
> >>> 2021
> >>>
> >>> $ mpifort --version
> >>> GNU Fortran (GCC) 12.0.0 20210524 (experimental)
> >>> Copyright (C) 2021 Free Software Foundation, Inc.
> >>>
> >>> $ which mpifort
> >>> /usr/beta/openmpi

Re: [OMPI users] cross-compilation documentation seems to be missing

2021-09-07 Thread Gilles Gouaillardet via users

Hi Jeff,


Here is a sample file I used some time ago (some definitions might be
missing though ...)



In order to automatically generate this file - this is a bit of a
chicken-and-egg problem -


you can run

configure -c

on the RISC-V node. It will generate a config.cache file.

Then you can

grep ^ompi_cv_fortran_ config.cache

to generate the file you can pass to --with-cross when cross compiling 
on your x86 system



Cheers,


Gilles


On 9/7/2021 7:35 PM, Jeff Hammond via users wrote:
I am attempting to cross-compile Open-MPI for RISC-V on an x86 
system.  I get this error, with which I have some familiarity:


checking size of Fortran CHARACTER... configure: error: Can not 
determine size of CHARACTER when cross-compiling


I know that I need to specify the size explicitly using a 
cross-compilation file.  According to configure, this is documented.


--with-cross=FILE       Specify configure values that can not be 
determined in a cross-compilation environment. See the Open MPI FAQ.


Where is this documented? 
https://www.open-mpi.org/faq/?category=building 
 contains nothing 
relevant.


Thanks,

Jeff

--
Jeff Hammond
jeff.scie...@gmail.com 
http://jeffhammond.github.io/ 
ompi_cv_fortran_alignment_CHARACTER=${ompi_cv_fortran_alignment_CHARACTER=1}
ompi_cv_fortran_alignment_COMPLEX=${ompi_cv_fortran_alignment_COMPLEX=4}
ompi_cv_fortran_alignment_COMPLEXp16=${ompi_cv_fortran_alignment_COMPLEXp16=4}
ompi_cv_fortran_alignment_COMPLEXp32=${ompi_cv_fortran_alignment_COMPLEXp32=4}
ompi_cv_fortran_alignment_COMPLEXp8=${ompi_cv_fortran_alignment_COMPLEXp8=4}
ompi_cv_fortran_alignment_DOUBLE_COMPLEX=${ompi_cv_fortran_alignment_DOUBLE_COMPLEX=4}
ompi_cv_fortran_alignment_DOUBLE_PRECISION=${ompi_cv_fortran_alignment_DOUBLE_PRECISION=4}
ompi_cv_fortran_alignment_INTEGER=${ompi_cv_fortran_alignment_INTEGER=4}
ompi_cv_fortran_alignment_INTEGERp1=${ompi_cv_fortran_alignment_INTEGERp1=1}
ompi_cv_fortran_alignment_INTEGERp2=${ompi_cv_fortran_alignment_INTEGERp2=2}
ompi_cv_fortran_alignment_INTEGERp4=${ompi_cv_fortran_alignment_INTEGERp4=4}
ompi_cv_fortran_alignment_INTEGERp8=${ompi_cv_fortran_alignment_INTEGERp8=8}
ompi_cv_fortran_alignment_LOGICAL=${ompi_cv_fortran_alignment_LOGICAL=4}
ompi_cv_fortran_alignment_LOGICALp1=${ompi_cv_fortran_alignment_LOGICALp1=1}
ompi_cv_fortran_alignment_LOGICALp2=${ompi_cv_fortran_alignment_LOGICALp2=2}
ompi_cv_fortran_alignment_LOGICALp4=${ompi_cv_fortran_alignment_LOGICALp4=4}
ompi_cv_fortran_alignment_LOGICALp8=${ompi_cv_fortran_alignment_LOGICALp8=8}
ompi_cv_fortran_alignment_REAL=${ompi_cv_fortran_alignment_REAL=4}
ompi_cv_fortran_alignment_REALp16=${ompi_cv_fortran_alignment_REALp16=4}
ompi_cv_fortran_alignment_REALp4=${ompi_cv_fortran_alignment_REALp4=4}
ompi_cv_fortran_alignment_REALp8=${ompi_cv_fortran_alignment_REALp8=4}
ompi_cv_fortran_external_symbol=${ompi_cv_fortran_external_symbol='single 
underscore'}
ompi_cv_fortran_handle_max=${ompi_cv_fortran_handle_max=2147483647}
ompi_cv_fortran_have_CHARACTER=${ompi_cv_fortran_have_CHARACTER=yes}
ompi_cv_fortran_have_COMPLEX=${ompi_cv_fortran_have_COMPLEX=yes}
ompi_cv_fortran_have_COMPLEXp16=${ompi_cv_fortran_have_COMPLEXp16=yes}
ompi_cv_fortran_have_COMPLEXp32=${ompi_cv_fortran_have_COMPLEXp32=yes}
ompi_cv_fortran_have_COMPLEXp4=${ompi_cv_fortran_have_COMPLEXp4=no}
ompi_cv_fortran_have_COMPLEXp8=${ompi_cv_fortran_have_COMPLEXp8=yes}
ompi_cv_fortran_have_DOUBLE_COMPLEX=${ompi_cv_fortran_have_DOUBLE_COMPLEX=yes}
ompi_cv_fortran_have_DOUBLE_PRECISION=${ompi_cv_fortran_have_DOUBLE_PRECISION=yes}
ompi_cv_fortran_have_INTEGER=${ompi_cv_fortran_have_INTEGER=yes}
ompi_cv_fortran_have_INTEGERp16=${ompi_cv_fortran_have_INTEGERp16=no}
ompi_cv_fortran_have_INTEGERp1=${ompi_cv_fortran_have_INTEGERp1=yes}
ompi_cv_fortran_have_INTEGERp2=${ompi_cv_fortran_have_INTEGERp2=yes}
ompi_cv_fortran_have_INTEGERp4=${ompi_cv_fortran_have_INTEGERp4=yes}
ompi_cv_fortran_have_INTEGERp8=${ompi_cv_fortran_have_INTEGERp8=yes}
ompi_cv_fortran_have_LOGICAL=${ompi_cv_fortran_have_LOGICAL=yes}
ompi_cv_fortran_have_LOGICALp1=${ompi_cv_fortran_have_LOGICALp1=yes}
ompi_cv_fortran_have_LOGICALp2=${ompi_cv_fortran_have_LOGICALp2=yes}
ompi_cv_fortran_have_LOGICALp4=${ompi_cv_fortran_have_LOGICALp4=yes}
ompi_cv_fortran_have_LOGICALp8=${ompi_cv_fortran_have_LOGICALp8=yes}
ompi_cv_fortran_have_REAL=${ompi_cv_fortran_have_REAL=yes}
ompi_cv_fortran_have_REALp16=${ompi_cv_fortran_have_REALp16=yes}
ompi_cv_fortran_have_REALp2=${ompi_cv_fortran_have_REALp2=no}
ompi_cv_fortran_have_REALp4=${ompi_cv_fortran_have_REALp4=yes}
ompi_cv_fortran_have_REALp8=${ompi_cv_fortran_have_REALp8=yes}
ompi_cv_fortran_have_iso_c_binding=${ompi_cv_fortran_have_iso_c_binding=yes}
ompi_cv_fortran_have_iso_fortran_env=${ompi_cv_fortran_have_iso_fortran_env=yes}
ompi_cv_fortran_have_storage_size=${ompi_cv_fortran_have_storage_size=yes}

Re: [OMPI users] OpenMPI-4.0.5 and MPI_spawn

2021-08-26 Thread Gilles Gouaillardet via users

Indeed ...


I am not 100% sure the two errors are unrelated, but anyway,


That example passes with Open MPI 4.0.1 and 4.0.6 and crashes with the
versions in between.


It also passes with the 4.1 and master branches


Bottom line, upgrade Open MPI to the latest version and you should be fine.



Cheers,


Gilles

On 8/26/2021 2:42 PM, Broi, Franco via users wrote:


Thanks Gilles but no go...

/usr/lib64/openmpi/bin/mpirun -c 1 --mca pml ^ucx 
/home/franco/spawn_example 47


I'm the parent on fsc07
Starting 47 children

  Process 1 ([[48649,2],32]) is on host: fsc08
  Process 2 ([[48649,1],0]) is on host: unknown!
  BTLs attempted: vader tcp self

Your MPI job is now going to abort; sorry.

[fsc08:465159] [[45369,2],27] ORTE_ERROR_LOG: Unreachable in file 
dpm/dpm.c at line 493


On Thu, 2021-08-26 at 14:30 +0900, Gilles Gouaillardet via users wrote:

Franco,

I am surprised UCX gets selected since there is no Infiniband network.
There used to be a bug that led UCX to be selected on shm/tcp
systems, but
it has been fixed. You might want to give a try to the latest 
versions of Open MPI

(4.0.6 or 4.1.1)

Meanwhile, try to
mpirun --mca pml ^ucx ...
and see if it helps


Cheers,

Gilles

On Thu, Aug 26, 2021 at 2:13 PM Broi, Franco via users 
mailto:users@lists.open-mpi.org>> wrote:

Hi,

I have 2 example progs that I found on the internet (attached) that 
illustrate a problem we are having launching multiple node jobs with 
OpenMPI-4.0.5 and MPI_spawn


CentOS Linux release 8.4.2105
openmpi-4.0.5-3.el8.x86_64
Slum 20.11.8

10Gbit ethernet network, no IB or other networks

I allocate 2 nodes, each with 24 cores. They are identical systems 
with a shared NFS root.


salloc -p fsc -w fsc07,fsc08 --ntasks-per-node=24

Running the hello prog with OpenMPI 4.0.5

/usr/lib64/openmpi/bin/mpirun --version
mpirun (Open MPI) 4.0.5

*/usr/lib64/openmpi/bin/mpirun /home/franco/hello*

MPI_Init(): 307.434000
hello, world (rank 0 of 48 fsc07)
...
MPI_Init(): 264.714000
hello, world (rank 47 of 48 fsc08)

All well and good.

Now running the MPI_spawn example prog with OpenMPI 4.0.1

*/library/mpi/openmpi-4.0.1//bin/mpirun -c 1 
/home/franco/spawn_example 47*


I'm the parent on fsc07
Starting 47 children

I'm the spawned.
hello, world (rank 0 of 47 fsc07)
Received 999 err 0 (rank 0 of 47 fsc07)
I'm the spawned.
hello, world (rank 1 of 47 fsc07)
Received 999 err 0 (rank 1 of 47 fsc07)

I'm the spawned.
hello, world (rank 45 of 47 fsc08)
Received 999 err 0 (rank 45 of 47 fsc08)
I'm the spawned.
hello, world (rank 46 of 47 fsc08)
Received 999 err 0 (rank 46 of 47 fsc08)

Works fine.

Now rebuild spawn_example with 4.0.5 and run as before

ldd /home/franco/spawn_example | grep openmpi
libmpi.so.40 => /usr/lib64/openmpi/lib/libmpi.so.40 
(0x7fc2c0655000)
libopen-rte.so.40 => 
/usr/lib64/openmpi/lib/libopen-rte.so.40 (0x7fc2bfdb6000)
libopen-pal.so.40 => 
/usr/lib64/openmpi/lib/libopen-pal.so.40 (0x7fc2bfb08000)


/usr/lib64/openmpi/bin/mpirun --version
mpirun (Open MPI) 4.0.5

*/usr/lib64/openmpi/bin/mpirun -c 1 /home/franco/spawn_example 47*

I'm the parent on fsc07
Starting 47 children

[fsc08:463361] pml_ucx.c:178  Error: Failed to receive UCX worker address: Not 
found (-13)
[fsc08:463361] [[42596,2],32] ORTE_ERROR_LOG: Error in file dpm/dpm.c at line 
493

[fsc08:462917] pml_ucx.c:178  Error: Failed to receive UCX worker address: Not 
found (-13)
[fsc08:462917] [[42416,2],33] ORTE_ERROR_LOG: Error in file dpm/dpm.c at line 
493

   ompi_dpm_dyn_init() failed
   --> Returned "Error" (-1) instead of "Success" (0)
--
[fsc08:462926] *** An error occurred in MPI_Init
[fsc08:462926] *** reported by process [2779774978,42]
[fsc08:462926] *** on a NULL communicator
[fsc08:462926] *** Unknown error
[fsc08:462926] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will 
now abort,
[fsc08:462926] ***and potentially your MPI job)
[fsc07:1158342] *** An error occurred in MPI_Comm_spawn_multiple
[fsc07:1158342] *** reported by process [2779774977,0]
[fsc07:1158342] *** on communicator MPI_COMM_WORLD
[fsc07:1158342] *** MPI_ERR_OTHER: known error not in list
[fsc07:1158342] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will 
now abort,
[fsc07:1158342] ***and potentially your MPI job)
[1629952748.688500] [fsc07:1158342:0]   sock.c:244  UCX  ERROR 
connect(fd=64, dest_addr=
10.220.6.239:38471
<http://10.220.6.239:38471>
) failed: Connection refused

The IP address is for node fsc08, the program is being run from fsc07

I see the orted process running on fsc08 for both hello and 
spwan_example with the same arguments. I also tried turning on 
various debug options but I'm none the wiser.


If I run the spawn example with 23 children it works fine - because 
they are all on fsc07.


Any idea what might be wrong?

Cheers,
Franco





Re: [OMPI users] OpenMPI-4.0.5 and MPI_spawn

2021-08-25 Thread Gilles Gouaillardet via users
Franco,

I am surprised UCX gets selected since there is no Infiniband network.
There used to be a bug that led UCX to be selected on shm/tcp systems, but
it has been fixed. You might want to give a try to the latest versions of
Open MPI
(4.0.6 or 4.1.1)

Meanwhile, try to
mpirun --mca pml ^ucx ...
and see if it helps


Cheers,

Gilles

On Thu, Aug 26, 2021 at 2:13 PM Broi, Franco via users <
users@lists.open-mpi.org> wrote:

> Hi,
>
> I have 2 example progs that I found on the internet (attached) that
> illustrate a problem we are having launching multiple node jobs with
> OpenMPI-4.0.5 and MPI_spawn
>
> CentOS Linux release 8.4.2105
> openmpi-4.0.5-3.el8.x86_64
> Slum 20.11.8
>
> 10Gbit ethernet network, no IB or other networks
>
> I allocate 2 nodes, each with 24 cores. They are identical systems with a
> shared NFS root.
>
> salloc -p fsc -w fsc07,fsc08 --ntasks-per-node=24
>
> Running the hello prog with OpenMPI 4.0.5
>
> /usr/lib64/openmpi/bin/mpirun --version
> mpirun (Open MPI) 4.0.5
>
> */usr/lib64/openmpi/bin/mpirun /home/franco/hello*
>
> MPI_Init(): 307.434000
> hello, world (rank 0 of 48 fsc07)
> ...
> MPI_Init(): 264.714000
> hello, world (rank 47 of 48 fsc08)
>
> All well and good.
>
> Now running the MPI_spawn example prog with OpenMPI 4.0.1
>
> */library/mpi/openmpi-4.0.1//bin/mpirun -c 1 /home/franco/spawn_example 47*
>
> I'm the parent on fsc07
> Starting 47 children
>
> I'm the spawned.
> hello, world (rank 0 of 47 fsc07)
> Received 999 err 0 (rank 0 of 47 fsc07)
> I'm the spawned.
> hello, world (rank 1 of 47 fsc07)
> Received 999 err 0 (rank 1 of 47 fsc07)
> 
> I'm the spawned.
> hello, world (rank 45 of 47 fsc08)
> Received 999 err 0 (rank 45 of 47 fsc08)
> I'm the spawned.
> hello, world (rank 46 of 47 fsc08)
> Received 999 err 0 (rank 46 of 47 fsc08)
>
> Works fine.
>
> Now rebuild spawn_example with 4.0.5 and run as before
>
> ldd /home/franco/spawn_example | grep openmpi
> libmpi.so.40 => /usr/lib64/openmpi/lib/libmpi.so.40
> (0x7fc2c0655000)
> libopen-rte.so.40 => /usr/lib64/openmpi/lib/libopen-rte.so.40
> (0x7fc2bfdb6000)
> libopen-pal.so.40 => /usr/lib64/openmpi/lib/libopen-pal.so.40
> (0x7fc2bfb08000)
>
> /usr/lib64/openmpi/bin/mpirun --version
> mpirun (Open MPI) 4.0.5
>
> */usr/lib64/openmpi/bin/mpirun -c 1 /home/franco/spawn_example 47*
>
> I'm the parent on fsc07
>
> Starting 47 children
>
>
> [fsc08:463361] pml_ucx.c:178  Error: Failed to receive UCX worker address: 
> Not found (-13)
>
> [fsc08:463361] [[42596,2],32] ORTE_ERROR_LOG: Error in file dpm/dpm.c at line 
> 493
>
> 
>
> [fsc08:462917] pml_ucx.c:178  Error: Failed to receive UCX worker address: 
> Not found (-13)
>
> [fsc08:462917] [[42416,2],33] ORTE_ERROR_LOG: Error in file dpm/dpm.c at line 
> 493
>
>
>   ompi_dpm_dyn_init() failed
>
>   --> Returned "Error" (-1) instead of "Success" (0)
>
> --
>
> [fsc08:462926] *** An error occurred in MPI_Init
>
> [fsc08:462926] *** reported by process [2779774978,42]
>
> [fsc08:462926] *** on a NULL communicator
>
> [fsc08:462926] *** Unknown error
>
> [fsc08:462926] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will 
> now abort,
>
> [fsc08:462926] ***and potentially your MPI job)
>
> [fsc07:1158342] *** An error occurred in MPI_Comm_spawn_multiple
>
> [fsc07:1158342] *** reported by process [2779774977,0]
>
> [fsc07:1158342] *** on communicator MPI_COMM_WORLD
>
> [fsc07:1158342] *** MPI_ERR_OTHER: known error not in list
>
> [fsc07:1158342] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will 
> now abort,
>
> [fsc07:1158342] ***and potentially your MPI job)
>
> [1629952748.688500] [fsc07:1158342:0]   sock.c:244  UCX  ERROR 
> connect(fd=64, dest_addr=10.220.6.239:38471) failed: Connection refused
>
>
> The IP address is for node fsc08, the program is being run from fsc07
>
> I see the orted process running on fsc08 for both hello and spwan_example
> with the same arguments. I also tried turning on various debug options but
> I'm none the wiser.
>
> If I run the spawn example with 23 children it works fine - because they
> are all on fsc07.
>
> Any idea what might be wrong?
>
> Cheers,
> Franco
>
>
>


Re: [OMPI users] vectorized reductions

2021-07-20 Thread Gilles Gouaillardet via users
You are welcome to provide any data that evidences the current 
implementation


(intrinsics, AVX512) is not the most efficient, and you are free to 
issue a Pull Request


in order to suggest a better one.


The op/avx component has pretty much nothing to do with scalability:

only one node is required to measure the performance, and the

test/datatype/reduce_local test can be used as a measurement.

/* several core counts should be used in order to fully evaluate the 
infamous AVX512 frequency downscaling */
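A possible sketch (from memory, so the exact invocation and options of
reduce_local may differ, please refer to its source):

mpirun -np 1 --bind-to core ./test/datatype/reduce_local
mpirun -np 8 --bind-to core ./test/datatype/reduce_local
mpirun -np 64 --bind-to core ./test/datatype/reduce_local
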


The benefits of op/avx (including AVX512) have been reported, for 
example at 
https://github.com/open-mpi/ompi/issues/8334#issuecomment-759864154



FWIW, George added SVE support in https://github.com/bosilca/ompi/pull/14,

and I added support for NEON and SVE in 
https://github.com/ggouaillardet/ompi/tree/topic/op_arm


None of these have been merged, but you are free to evaluate them and 
report the performance numbers.




On 7/20/2021 11:00 PM, Dave Love via users wrote:

Gilles Gouaillardet via users  writes:


One motivation is packaging: a single Open MPI implementation has to be
built, that can run on older x86 processors (supporting only SSE) and the
latest ones (supporting AVX512).

I take dispatch on micro-architecture for granted, but it doesn't
require an assembler/intrinsics implementation.  See the level-1
routines in recent BLIS, for example (an instance where GCC was supposed
to fail).  That works for all relevant architectures, though I don't
think the aarch64 and ppc64le dispatch was ever included.  Presumably
it's less prone to errors than low-level code.


The op/avx component will select at
runtime the most efficient implementation for vectorized reductions.

It will select the micro-architecture with the most features, which may
or may not be the most efficient.  Is the avx512 version actually faster
than avx2?

Anyway, if this is important at scale, which I can't test, please at
least vectorize op_base_functions.c for aarch64 and ppc64le.  With GCC,
and probably other compilers -- at least clang, I think -- it doesn't
even need changes to cc flags.  With GCC and recent glibc, target clones
cover micro-arches with practically no effort.  Otherwise you probably
need similar infrastructure to what's there now, but not to devote the
effort to using intrinsics as far as I can see.




Re: [OMPI users] vectorized reductions

2021-07-19 Thread Gilles Gouaillardet via users
One motivation is packaging: a single Open MPI implementation has to be
built, that can run on older x86 processors (supporting only SSE) and the
latest ones (supporting AVX512). The op/avx component will select at
runtime the most efficient implementation for vectorized reductions.
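You can confirm the component was built with something like
ompi_info | grep "MCA op"
which should list the avx component.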

On Mon, Jul 19, 2021 at 11:11 PM Dave Love via users <
users@lists.open-mpi.org> wrote:

> I meant to ask a while ago about vectorized reductions after I saw a
> paper that I can't now find.  I didn't understand what was behind it.
>
> Can someone explain why you need to hand-code the avx implementations of
> the reduction operations now used on x86_64?  As far as I remember, the
> paper didn't justify the effort past alluding to a compiler being unable
> to vectorize reductions.  I wonder which compiler(s); the recent ones
> I'm familiar with certainly can if you allow them (or don't stop them --
> icc, sigh).  I've been assured before that GCC can't, but that's
> probably due to using the default correct FP compilation and/or not
> restricting function arguments.  So I wonder what's the problem just
> using C and a tolerably recent GCC if necessary -- is there something
> else behind this?
>
> Since only x86 is supported, I had a go on ppc64le and with minimal
> effort saw GCC vectorizing more of the base implementation functions
> than are included in the avx version.  Similarly for x86
> micro-architectures.  (I'd need convincing that avx512 is worth the
> frequency reduction.)  It would doubtless be the same on aarch64, say,
> but I only have the POWER.
>
> Thanks for any info.
>


Re: [OMPI users] how to suppress "libibverbs: Warning: couldn't load driver ..." messages?

2021-06-23 Thread Gilles Gouaillardet via users
Hi Jeff,

Assuming you did **not** explicitly configure Open MPI with
--disable-dlopen, you can try
mpirun --mca pml ob1 --mca btl vader,self ...
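With your command line below, that would be something along these lines:

OMP_NUM_THREADS=1 /proj/nv/Linux_aarch64/21.5/comm_libs/openmpi/openmpi-3.1.5/bin/mpirun --mca pml ob1 --mca btl vader,self -n 40 /local/home/jehammond/NWCHEM/nvhpc-mpi-pr/bin/LINUX64/nwchem w12_b3lyp_cc-pvtz_energy.nw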

Cheers,

Gilles

On Thu, Jun 24, 2021 at 5:08 AM Jeff Hammond via users <
users@lists.open-mpi.org> wrote:

> I am running on a single node and do not need any network support.  I am
> using the NVIDIA build of Open-MPI 3.1.5.  How do I tell it to never use
> anything related to IB?  It seems that ^openib is not enough.
>
> Thanks,
>
> Jeff
>
> $ OMP_NUM_THREADS=1
> /proj/nv/Linux_aarch64/21.5/comm_libs/openmpi/openmpi-3.1.5/bin/mpirun
> --mca btl ^openib -n 40
> /local/home/jehammond/NWCHEM/nvhpc-mpi-pr/bin/LINUX64/nwchem
> w12_b3lyp_cc-pvtz_energy.nw | tee
> w12_b3lyp_cc-pvtz_energy.nvhpc-mpi-pr.n40.log
>
> libibverbs: Warning: couldn't load driver 'libvmw_pvrdma-rdmav25.so':
> libvmw_pvrdma-rdmav25.so: cannot open shared object file: No such file or
> directory
> libibverbs: Warning: couldn't load driver 'libsiw-rdmav25.so':
> libsiw-rdmav25.so: cannot open shared object file: No such file or directory
> libibverbs: Warning: couldn't load driver 'librxe-rdmav25.so':
> librxe-rdmav25.so: cannot open shared object file: No such file or directory
> libibverbs: Warning: couldn't load driver 'libqedr-rdmav25.so':
> libqedr-rdmav25.so: cannot open shared object file: No such file or
> directory
> libibverbs: Warning: couldn't load driver 'libmlx5-rdmav25.so':
> libmlx5-rdmav25.so: cannot open shared object file: No such file or
> directory
> libibverbs: Warning: couldn't load driver 'libmlx4-rdmav25.so':
> libmlx4-rdmav25.so: cannot open shared object file: No such file or
> directory
> libibverbs: Warning: couldn't load driver 'libi40iw-rdmav25.so':
> libi40iw-rdmav25.so: cannot open shared object file: No such file or
> directory
> libibverbs: Warning: couldn't load driver 'libhns-rdmav25.so':
> libhns-rdmav25.so: cannot open shared object file: No such file or directory
> libibverbs: Warning: couldn't load driver 'libhfi1verbs-rdmav25.so':
> libhfi1verbs-rdmav25.so: cannot open shared object file: No such file or
> directory
> libibverbs: Warning: couldn't load driver 'libcxgb4-rdmav25.so':
> libcxgb4-rdmav25.so: cannot open shared object file: No such file or
> directory
> libibverbs: Warning: couldn't load driver 'libbnxt_re-rdmav25.so':
> libbnxt_re-rdmav25.so: cannot open shared object file: No such file or
> directory
> libibverbs: Warning: couldn't load driver 'libvmw_pvrdma-rdmav25.so':
> libvmw_pvrdma-rdmav25.so: cannot open shared object file: No such file or
> directory
>
>
>
>


Re: [OMPI users] (Fedora 34, x86_64-pc-linux-gnu, openmpi-4.1.1.tar.gz): PML ucx cannot be selected

2021-05-28 Thread Gilles Gouaillardet via users
Jorge,

pml/ucx used to be selected when no fast interconnect was detected
(since ucx provides drivers for both TCP and shared memory).
These providers are now disabled by default, so unless your machine
has a supported fast interconnect (such as Infiniband),
pml/ucx cannot be used out of the box anymore.

if you really want to use pml/ucx on your notebook, you need to
manually re-enable these providers.

That being said, your best choice here is really not to force any pml,
and let Open MPI use pml/ob1
(that has support for both TCP and shared memory)
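For example, adapting your command line below (no pml forced, shared
memory and TCP btls kept):

mpirun --mca pml ob1 --mca btl self,vader,tcp --map-by node --report-bindings --machinefile ~/machi-openmpi.dat --np 2 hello_usempi_f08.exe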

Cheers,

Gilles

On Sat, May 29, 2021 at 11:19 AM Jorge D'Elia via users
 wrote:
>
> Hi,
>
> We routinely build OpenMPI on x86_64-pc-linux-gnu machines from
> the sources using gcc and usually everything works fine.
>
> In one case we recently installed Fedora 34 from scratch on an
> ASUS G53SX notebook (Intel Core i7-2630QM CPU 2.00GHz ×4 cores,
> without any IB device). Next we build OpenMPI using the file
> openmpi-4.1.1.tar.gz and the GCC 12.0.0 20210524 (experimental)
> compiler.
>
> However, when trying to experiment OpenMPI using UCX
> with a simple test, we get the runtime errors:
>
>   No components were able to be opened in the btl framework.
>   PML ucx cannot be selected
>
> while the test worked fine until Fedora 33 on the same
> machine using the same OpenMPI configuration.
>
> We attach below some info about a simple test run.
>
> Please, any clues where to check or maybe something is missing?
> Thanks in advance.
>
> Regards
> Jorge.
>
> --
> $ cat /proc/version
> Linux version 5.12.7-300.fc34.x86_64 
> (mockbu...@bkernel01.iad2.fedoraproject.org) (gcc (GCC) 11.1.1 20210428 (Red 
> Hat 11.1.1-1), GNU ld version 2.35.1-41.fc34) #1 SMP Wed May 26 12:58:58 UTC 
> 2021
>
> $ mpifort --version
> GNU Fortran (GCC) 12.0.0 20210524 (experimental)
> Copyright (C) 2021 Free Software Foundation, Inc.
>
> $ which mpifort
> /usr/beta/openmpi/bin/mpifort
>
> $ mpifort -o hello_usempi_f08.exe hello_usempi_f08.f90
>
> $ mpirun --mca orte_base_help_aggregate 0 --mca btl self,vader,tcp --map-by 
> node --report-bindings --machinefile ~/machi-openmpi.dat --np 2  
> hello_usempi_f08.exe
> [verne:200650] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././.]
> [verne:200650] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./.]
> Hello, world, I am  0 of  2: Open MPI v4.1.1, package: Open MPI bigpack@verne 
> Distribution, ident: 4.1.1, repo rev: v4.1.1, Apr 24, 2021
> Hello, world, I am  1 of  2: Open MPI v4.1.1, package: Open MPI bigpack@verne 
> Distribution, ident: 4.1.1, repo rev: v4.1.1, Apr 24, 2021
>
> $ mpirun --mca orte_base_help_aggregate 0 --mca pml ucx --mca btl 
> ^self,vader,tcp --map-by node --report-bindings --machinefile 
> ~/machi-openmpi.dat --np 2  hello_usempi_f08.exe
> [verne:200772] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././.]
> [verne:200772] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./.]
> --
> No components were able to be opened in the btl framework.
>
> This typically means that either no components of this type were
> installed, or none of the installed components can be loaded.
> Sometimes this means that shared libraries required by these
> components are unable to be found/loaded.
>
>   Host:  verne
>   Framework: btl
> --
> --
> No components were able to be opened in the btl framework.
>
> This typically means that either no components of this type were
> installed, or none of the installed components can be loaded.
> Sometimes this means that shared libraries required by these
> components are unable to be found/loaded.
>
>   Host:  verne
>   Framework: btl
> --
> --
> No components were able to be opened in the pml framework.
>
> This typically means that either no components of this type were
> installed, or none of the installed components can be loaded.
> Sometimes this means that shared libraries required by these
> components are unable to be found/loaded.
>
>   Host:  verne
>   Framework: pml
> --
> [verne:200777] PML ucx cannot be selected
> --
> No components were able to be opened in the pml framework.
>
> This typically means that either no components of this type were
> installed, or none of the installed components can be loaded.
> Sometimes this means that shared libraries required by these
> components are unable to be found/loaded.
>
>   Host:  verne
>   Framework: pml
> --
> [verne:200772] PMIX ERROR: 

Re: [OMPI users] [EXTERNAL] Linker errors in Fedora 34 Docker container

2021-05-25 Thread Gilles Gouaillardet via users
Howard,



I have a recollection of a similar issue that only occurs with the 
latest flex (that requires its own library to be passed to the linker).

I cannot remember if this was a flex packaging issue, or if we ended up 
recommending to downgrade flex to

a known to work version.



The issue should not happen if building from an official tarball though.



Cheers,



Gilles

- Original Message -

Hi John,

 

I don’t think an external dependency is going to fix this.

 

In your build area, do you see any .lo files in

 

opal/util/keyval

 

?

 

Which compiler are you using?

 

Also, are you building from the tarballs at 
https://www.open-mpi.org/software/ompi/v4.1/
 ?

 

Howard

 

From: users  on behalf of John 
Haiducek via users 
Reply-To: Open MPI Users 
Date: Tuesday, May 25, 2021 at 3:49 PM
To: "users@lists.open-mpi.org" 
Cc: John Haiducek 
Subject: [EXTERNAL] [OMPI users] Linker errors in Fedora 34 Docker 
container

 

Hi,

 

When attempting to build OpenMPI in a Fedora 34 Docker image I get the 
following linker errors:

 

#22 77.36 make[2]: Entering directory '/build/openmpi-4.1.1/opal/tools/
wrappers'
#22 77.37   CC   opal_wrapper.o
#22 77.67   CCLD opal_wrapper
#22 77.81 /usr/bin/ld: ../../../opal/.libs/libopen-pal.so: undefined 
reference to `opal_util_keyval_yytext'
#22 77.81 /usr/bin/ld: ../../../opal/.libs/libopen-pal.so: undefined 
reference to `opal_util_keyval_yyin'
#22 77.81 /usr/bin/ld: ../../../opal/.libs/libopen-pal.so: undefined 
reference to `opal_util_keyval_yylineno'
#22 77.81 /usr/bin/ld: ../../../opal/.libs/libopen-pal.so: undefined 
reference to `opal_util_keyval_yynewlines'
#22 77.81 /usr/bin/ld: ../../../opal/.libs/libopen-pal.so: undefined 
reference to `opal_util_keyval_yylex'
#22 77.81 /usr/bin/ld: ../../../opal/.libs/libopen-pal.so: undefined 
reference to `opal_util_keyval_parse_done'
#22 77.81 /usr/bin/ld: ../../../opal/.libs/libopen-pal.so: undefined 
reference to `opal_util_keyval_yylex_destroy'
#22 77.81 /usr/bin/ld: ../../../opal/.libs/libopen-pal.so: undefined 
reference to `opal_util_keyval_init_buffer'
#22 77.81 collect2: error: ld returned 1 exit status
My configure command is just ./configure --prefix=/usr/local/openmpi.

I also tried ./configure --prefix=/usr/local/openmpi --disable-silent-
rules --enable-builtin-atomics --with-hwloc=/usr --with-libevent=
external --with-pmix=external --with-valgrind (similar to what is in the 
Fedora spec file for OpenMPI) but that produces the same errors.

 

Is there a third-party library I need to install or an additional 
configure option I can set that will fix these?

 

John
 


Re: [OMPI users] Stable and performant openMPI version for Ubuntu20.04?

2021-04-08 Thread Gilles Gouaillardet via users
Are you using gcc provided by Ubuntu 20.04?
if not which compiler (vendor and version) are you using?

My (light) understanding is that this patch should not impact 
performances, so I am not
sure whether the performance being back is something I do not understand,
 or the side effect
of a compiler bug.

Anyway, I issued https://github.com/open-mpi/ompi/pull/8789 and asked 
for a review.

Cheers,

Gilles

- Original Message -
> Dear Gilles,
> As per your suggestion, I tried the inline patch 
as discussed in 
https://github.com/open-mpi/ompi/pull/8622#issuecomment-800776864
 .
> 
> This has fixed the regression completely for the remaining test cases 
in FFTW MPI in-built test bench - which was persisting even after using 
the git patch 
https://patch-diff.githubusercontent.com/raw/open-mpi/ompi/pull/8623.patch
 as merged by you.
> So, it seems there is a performance difference between asm volatile("":
 : :"memory"); and __atomic_thread_fence (__ATOMIC_ACQUIRE) on x86_64.
> 
> I would request you to please make this change and merge it to 
respective openMPI branches - please intimate if possible whenever that 
takes place.
> I also request you to plan for an early 4.1.1rc2 release at least by 
June 2021.
> 
> With Regards,
> S. Biplab Raut 
> 
> -Original Message-
> From: Gilles Gouaillardet  
> Sent: Thursday, April 1, 2021 8:31 AM
> To: Raut, S Biplab 
> Subject: Re: [OMPI users] Stable and performant openMPI version for 
Ubuntu20.04 ?
> 
> [CAUTION: External Email]
> 
> I really had no time to investigate this.
> 
> A quick test is to apply the patch in the inline comment at
> https://github.com/open-mpi/ompi/pull/8622#issuecomment-800776864 and see whether it helps.
> 
> If not, I would recommend you try Open MPI 3.1.6 (after manually 
applying 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fopen-mpi%2Fompi%2Fpull%2F8624.patchdata=04%7C01%7CBiplab.Raut%40amd.com%7C6b277b24afa04650c86c08d8f4ba5dc7%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637528428572315404%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=yZbu1dDcC1awpiuclvyso9HANqAHIEn4p1pT862n4LY%3Dreserved=0)
 and see whether there is a performance regression between 3.1.1 and (
patched) 3.1.6
> 
> Cheers,
> 
> Gilles
> 
> On Thu, Apr 1, 2021 at 11:25 AM Raut, S Biplab  
wrote:
> >
> > Dear Gilles,
> >  Did you get a chance to look into my below mail 
content?
> > I find the regression is not completely fixed.
> >
> > With Regards,
> > S. Biplab Raut
> >
> > -Original Message-
> > From: Raut, S Biplab
> > Sent: Wednesday, March 24, 2021 11:32 PM
> > To: Gilles Gouaillardet 
> > Subject: RE: [OMPI users] Stable and performant openMPI version for 
Ubuntu20.04 ?
> >
> > Dear Gilles,
> > After applying the below patch, I thoroughly 
tested various test cases of FFTW using its in-built benchmark test 
program.
> > Many of the test cases, that showed regression previously as 
compared to openMPI3.1.1, have now improved with positive gains.
> > However, there are still few test cases where the performance is 
lower than openMPI3.1.1.
> > Are there more performance issues in openMPI4.x that need to be 
discovered?
> >
> > Please check the below details.
> >
> > 1) For problem size 1024x1024x512 :-
> >  $   mpirun --map-by core --rank-by core --bind-to core  ./fftw/
mpi/mpi-bench -opatient -r500 -s dcif1024x1024x512
> >  openMPI3.3.1_stock performance -> 147 MFLOPS
> >  openMPI4.1.0_stock performance -> 137 MFLOPS
> >  openMPI4.1.0_patch performance -> 137 MFLOPS
> > 2) For problem size 512x512x512 :-
> >  $   mpirun --map-by core --rank-by core --bind-to core  ./fftw/
mpi/mpi-bench -opatient -r500 -s dcif512x512x512
> >  openMPI3.3.1_stock performance -> 153  MFLOPS
> >  openMPI4.1.0_stock performance -> 144 MFLOPS
> >  openMPI4.1.0_patch performance -> 147 MFLOPS
> >
> > With Regards,
> > S. Biplab Rsut
> >
> > -Original Message-
> > From: Gilles Gouaillardet 
> > Sent: Wednesday, March 17, 2021 11:14 AM
> > To: Raut, S Biplab 
> > Subject: Re: [OMPI users] Stable and performant openMPI version for Ubuntu20.04 ?

Re: [OMPI users] Building Open-MPI with Intel C

2021-04-07 Thread Gilles Gouaillardet via users
Michael,

orted is able to find its dependencies on the Intel runtime on the
host where you sourced the environment.
However, it is unlikely to be able to do so on a remote host.
For example
ssh ... ldd `which orted`
will likely fail.

An option is to use -rpath (and add the path to the Intel runtime).
IIRC, there is also an option in the Intel compiler to statically link
to the runtime.

Cheers,

Gilles

On Wed, Apr 7, 2021 at 9:00 AM Heinz, Michael William via users
 wrote:
>
> I’m having a heck of a time building OMPI with Intel C. Compilation goes 
> fine, installation goes fine, compiling test apps (the OSU benchmarks) goes 
> fine…
>
>
>
> but when I go to actually run an MPI app I get:
>
>
>
> [awbp025:~/work/osu-icc](N/A)$ /usr/mpi/icc/openmpi-icc/bin/mpirun -np 2 -H 
> awbp025,awbp026,awbp027,awbp028 -x FI_PROVIDER=opa1x -x 
> LD_LIBRARY_PATH=/usr/mpi/icc/openmpi-icc/lib64:/lib hostname
>
> /usr/mpi/icc/openmpi-icc/bin/orted: error while loading shared libraries: 
> libimf.so: cannot open shared object file: No such file or directory
>
> /usr/mpi/icc/openmpi-icc/bin/orted: error while loading shared libraries: 
> libimf.so: cannot open shared object file: No such file or directory
>
>
>
> Looking at orted, it does seem like the binary is linking correctly:
>
>
>
> [awbp025:~/work/osu-icc](N/A)$ /usr/mpi/icc/openmpi-icc/bin/orted
>
> [awbp025:620372] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file 
> ess_env_module.c at line 135
>
> [awbp025:620372] [[INVALID],INVALID] ORTE_ERROR_LOG: Bad parameter in file 
> util/session_dir.c at line 107
>
> [awbp025:620372] [[INVALID],INVALID] ORTE_ERROR_LOG: Bad parameter in file 
> util/session_dir.c at line 346
>
> [awbp025:620372] [[INVALID],INVALID] ORTE_ERROR_LOG: Bad parameter in file 
> base/ess_base_std_orted.c at line 264
>
> --
>
> It looks like orte_init failed for some reason; your parallel process is
>
> likely to abort.  There are many reasons that a parallel process can
>
> fail during orte_init; some of which are due to configuration or
>
> environment problems.  This failure appears to be an internal failure;
>
> here's some additional information (which may only be relevant to an
>
> Open MPI developer):
>
>
>
>   orte_session_dir failed
>
>   --> Returned value Bad parameter (-5) instead of ORTE_SUCCESS
>
> --
>
>
>
> and…
>
>
>
> [awbp025:~/work/osu-icc](N/A)$ ldd /usr/mpi/icc/openmpi-icc/bin/orted
>
> linux-vdso.so.1 (0x7fffc2ebf000)
>
> libopen-rte.so.40 => /usr/mpi/icc/openmpi-icc/lib/libopen-rte.so.40 
> (0x7fdaa6404000)
>
> libopen-pal.so.40 => /usr/mpi/icc/openmpi-icc/lib/libopen-pal.so.40 
> (0x7fdaa60bd000)
>
> libopen-orted-mpir.so => 
> /usr/mpi/icc/openmpi-icc/lib/libopen-orted-mpir.so (0x7fdaa5ebb000)
>
> libm.so.6 => /lib64/libm.so.6 (0x7fdaa5b39000)
>
> librt.so.1 => /lib64/librt.so.1 (0x7fdaa5931000)
>
> libutil.so.1 => /lib64/libutil.so.1 (0x7fdaa572d000)
>
> libz.so.1 => /lib64/libz.so.1 (0x7fdaa5516000)
>
> libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x7fdaa52fe000)
>
> libpthread.so.0 => /lib64/libpthread.so.0 (0x7fdaa50de000)
>
> libc.so.6 => /lib64/libc.so.6 (0x7fdaa4d1b000)
>
> libdl.so.2 => /lib64/libdl.so.2 (0x7fdaa4b17000)
>
> libimf.so => 
> /opt/intel/compilers_and_libraries_2020.4.304/linux/compiler/lib/intel64_lin/libimf.so
>  (0x7fdaa4494000)
>
> libsvml.so => 
> /opt/intel/compilers_and_libraries_2020.4.304/linux/compiler/lib/intel64_lin/libsvml.so
>  (0x7fdaa29c4000)
>
> libirng.so => 
> /opt/intel/compilers_and_libraries_2020.4.304/linux/compiler/lib/intel64_lin/libirng.so
>  (0x7fdaa2659000)
>
> libintlc.so.5 => 
> /opt/intel/compilers_and_libraries_2020.4.304/linux/compiler/lib/intel64_lin/libintlc.so.5
>  (0x7fdaa23e1000)
>
> /lib64/ld-linux-x86-64.so.2 (0x7fdaa66d6000)
>
>
>
> Can anyone suggest what I’m forgetting to do?
>
>
>
> ---
>
> Michael Heinz
> Fabric Software Engineer, Cornelis Networks
>
>


Re: [OMPI users] HWLOC icc error

2021-03-23 Thread Gilles Gouaillardet via users
Luis,

this file is never compiled when an external hwloc is used.

Please open a github issue and include all the required information


Cheers,

Gilles

On Tue, Mar 23, 2021 at 5:44 PM Luis Cebamanos via users
 wrote:
>
> Hello,
>
> Compiling OpenMPI 4.0.5 with Intel 2020 I came across this error. Has
> anyone seen this before? I have tried with internal and external HWLOC
> with the same outcome.
>
>
> CC   net.lo
> In file included from ../../opal/mca/hwloc/hwloc201/hwloc201.h(26),
>  from ../../opal/mca/hwloc/hwloc-internal.h(131),
>  from ../../opal/util/proc.h(22),
>  from error.c(36):
> ../../opal/mca/hwloc/hwloc201/hwloc/include/hwloc.h(56): catastrophic
> error: cannot open source file "hwloc/autogen/config.h"
>   #include 
>^
>
> compilation aborted for error.c (code 4)
>
>
> Regards,
> Luis


Re: [OMPI users] [External] Help with MPI and macOS Firewall

2021-03-18 Thread Gilles Gouaillardet via users
Matt,

you can either

mpirun --mca btl self,vader ...

or

export OMPI_MCA_btl=self,vader
mpirun ...

you may also add
btl = self,vader
in your /etc/openmpi-mca-params.conf
and then simply

mpirun ...

Cheers,

Gilles

On Fri, Mar 19, 2021 at 5:44 AM Matt Thompson via users
 wrote:
>
> Prentice,
>
> Ooh. The first one seems to work. The second one apparently is not liked by 
> zsh and I had to do:
> ❯ mpirun -mca btl '^tcp' -np 6 ./helloWorld.mpi3.exe
> Compiler Version: GCC version 10.2.0
> MPI Version: 3.1
> MPI Library Version: Open MPI v4.1.0, package: Open MPI 
> mathomp4@gs6101-parcel.local Distribution, ident: 4.1.0, repo rev: v4.1.0, 
> Dec 18, 2020
>
> Next question: is this:
>
> OMPI_MCA_btl='self,vader'
>
> the right environment variable translation of that command-line option?
>
> On Thu, Mar 18, 2021 at 3:40 PM Prentice Bisbal via users 
>  wrote:
>>
>> OpenMPI should only be using shared memory on the local host automatically, 
>> but maybe you need to force it.
>>
>> I think
>>
>> mpirun -mca btl self,vader ...
>>
>> should do that.
>>
>> or you can exclude tcp instead
>>
>> mpirun -mca btl ^tcp
>>
>> See
>>
>> https://www.open-mpi.org/faq/?category=sm
>>
>> for more info.
>>
>> Prentice
>>
>> On 3/18/21 12:28 PM, Matt Thompson via users wrote:
>>
>> All,
>>
>> This isn't specifically an Open MPI issue, but as that is the MPI stack I 
>> use on my laptop, I'm hoping someone here might have a possible solution. (I 
>> am pretty sure something like MPICH would trigger this as well.)
>>
>> Namely, my employer recently did something somewhere so that now *any* MPI 
>> application I run will throw popups like this one:
>>
>> https://user-images.githubusercontent.com/4114656/30962814-866f3010-a44b-11e7-9de3-9f2a3b0229c0.png
>>
>> though for me it's asking about "orterun" and "helloworld.mpi3.exe", etc. I 
>> essentially get one-per-process.
>>
>> If I had sudo access, I suppose I could just keep clicking "Allow" for every 
>> program, but I don't and I compile lots of programs with different names.
>>
>> So, I was hoping maybe an Open MPI guru out there knew of an MCA thing I 
>> could use to avoid them? This is all isolated on-my-laptop MPI I'm doing, so 
>> at most an "mpirun --oversubscribe -np 12" or something. It'll never go over 
>> my network to anything, etc.
>>
>> --
>> Matt Thompson
>>“The fact is, this is about us identifying what we do best and
>>finding more ways of doing less of it better” -- Director of Better Anna 
>> Rampton
>
>
>
> --
> Matt Thompson
>“The fact is, this is about us identifying what we do best and
>finding more ways of doing less of it better” -- Director of Better Anna 
> Rampton


Re: [OMPI users] config: gfortran: "could not run a simple Fortran program"

2021-03-07 Thread Gilles Gouaillardet via users
Anthony,

Did you make sure you can compile a simple fortran program with
gfortran? and gcc?
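For example (file names are just an illustration):

cat > hello.f90 << EOF
program hello
  print *, 'hello'
end program hello
EOF
gfortran hello.f90 -o hello_f && ./hello_f

and a similar trivial C program with your C compiler.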

Please compress and attach both openmpi-config.out and config.log, so
we can diagnose the issue.

Cheers,

Gilles

On Mon, Mar 8, 2021 at 6:48 AM Anthony Rollett via users
 wrote:
>
> I am trying to configure v 4.1 with the following, which fails as noted in 
> the Subject line.
>
> ./configure --prefix=/Users/Shared/openmpi410 \
> FC=gfortran CC=clang CXX=c++ --disable-static \
> 2>&1 | tee openmpi-config.out
>
> On a 2019 MacbookPro with 10.15 (but I had the same problem with 10.14).
> Gfortran (and gcc) is from High Performance Computing for OSX
>
> Any clues will be gratefully received! And I apologize if this is a solved 
> problem ...
> Many thanks, Tony Rollett
> PS.  If I try “CC=gcc CXX=g++” then it fails at the C compilation stage.


Re: [OMPI users] Stable and performant openMPI version for Ubuntu20.04 ?

2021-03-04 Thread Gilles Gouaillardet via users
On top of XPMEM, try to also force btl/vader with
mpirun --mca pml ob1 --mca btl vader,self ...

On Fri, Mar 5, 2021 at 8:37 AM Nathan Hjelm via users
 wrote:
>
> I would run the v4.x series and install xpmem if you can 
> (http://github.com/hjelmn/xpmem). You will need to build with 
> --with-xpmem=/path/to/xpmem to use xpmem otherwise vader will default to using 
> CMA. This will provide the best possible performance.
>
> -Nathan
>
> On Mar 4, 2021, at 5:55 AM, Raut, S Biplab via users 
>  wrote:
>
> [AMD Official Use Only - Internal Distribution Only]
>
> It is a single node execution, so it should be using shared memory (vader).
>
> With Regards,
> S. Biplab Raut
>
> From: Heinz, Michael William 
> Sent: Thursday, March 4, 2021 5:17 PM
> To: Open MPI Users 
> Cc: Raut, S Biplab 
> Subject: Re: [OMPI users] Stable and performant openMPI version for 
> Ubuntu20.04 ?
>
> [CAUTION: External Email]
>
> What interconnect are you using at run time? That is, are you using Ethernet 
> or InfiniBand or Omnipath?
>
> Sent from my iPad
>
>
>
> On Mar 4, 2021, at 5:05 AM, Raut, S Biplab via users 
>  wrote:
>
> 
> [AMD Official Use Only - Internal Distribution Only]
>
> After downloading a particular openMPI version, let’s say v3.1.1 from 
> https://download.open-mpi.org/release/open-mpi/v3.1/openmpi-3.1.1.tar.gz , I 
> follow the below steps.
> ./configure --prefix="$INSTALL_DIR" --enable-mpi-fortran --enable-mpi-cxx 
> --enable-shared=yes --enable-static=yes --enable-mpi1-compatibility
>   make -j
>   make install
>   export PATH=$INSTALL_DIR/bin:$PATH
>   export LD_LIBRARY_PATH=$INSTALL_DIR/lib:$LD_LIBRARY_PATH
> Additionally, I also install libnuma-dev on the machine.
>
> For all the machines having Ubuntu 18.04 and 19.04, it works correctly and 
> results in expected performance/GFLOPS.
> But, when OS is changed to Ubuntu 20.04, then I start getting the issues as 
> mentioned in my original/previous mail below.
>
> With Regards,
> S. Biplab Raut
>
> From: users  On Behalf Of John Hearns via 
> users
> Sent: Thursday, March 4, 2021 1:53 PM
> To: Open MPI Users 
> Cc: John Hearns 
> Subject: Re: [OMPI users] Stable and performant openMPI version for 
> Ubuntu20.04 ?
>
> [CAUTION: External Email]
> How are you installing the OpenMPI versions? Are you using packages which are 
> distributed by the OS?
>
> It might be worth looking at using Easybuid or Spack
> https://docs.easybuild.io/en/latest/Introduction.html
> https://spack.readthedocs.io/en/latest/
>
>
> On Thu, 4 Mar 2021 at 07:35, Raut, S Biplab via users 
>  wrote:
>
> [AMD Official Use Only - Internal Distribution Only]
>
> Dear Experts,
> Until recently, I was using openMPI3.1.1 to run 
> single node 128 ranks MPI application on Ubuntu18.04 and Ubuntu19.04.
> But, now the OS on these machines are upgraded to Ubuntu20.04, and I have 
> been observing program hangs with openMPI3.1.1 version.
> So, I tried with openMPI4.0.5 version – The program ran properly without any 
> issues but there is a performance regression in my application.
>
> Can I know the stable openMPI version recommended for Ubuntu20.04 that has no 
> known regression compared to v3.1.1.
>
> With Regards,
> S. Biplab Raut
>
>


Re: [OMPI users] MPI executable fails on ArchLinux on Termux

2021-02-25 Thread Gilles Gouaillardet via users

yes, you need to (re)build Open MPI from source in order to try this trick.
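
A minimal sketch, following the usual steps (the install prefix is just
an example):

export ac_cv_type_struct_ifreq=no
./configure --prefix=$HOME/openmpi
make -j && make install
export PATH=$HOME/openmpi/bin:$PATH
export LD_LIBRARY_PATH=$HOME/openmpi/lib:$LD_LIBRARY_PATH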

On 2/26/2021 3:55 PM, LINUS FERNANDES via users wrote:

No change.
What do you mean by running configure?
Are you expecting me to build OpenMPI from source?

On Fri, 26 Feb 2021, 11:16 Gilles Gouaillardet via users, 
mailto:users@lists.open-mpi.org>> wrote:


Before running configure, try to
export ac_cv_type_struct_ifreq=no
and see how it goes

On Fri, Feb 26, 2021 at 8:18 AM LINUS FERNANDES via users
mailto:users@lists.open-mpi.org>> wrote:
>
> https://github.com/SDRausty/termux-archlinux/issues/78
<https://github.com/SDRausty/termux-archlinux/issues/78>
> On Fri, 26 Feb 2021, 04:28 LINUS FERNANDES,
mailto:linus.fernan...@gmail.com>> wrote:
>>
>> ifconfig on Termux
>>
>> dummy0: flags=195 mtu 1500
>>         inet6 fe80::488e:42ff:fe43:b843  prefixlen 64  scopeid
0x20
>>         unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 
txqueuelen 0 (UNSPEC)
>>         RX packets 0  bytes 0 (0.0 B)
>>         RX errors 0  dropped 0  overruns 0  frame 0
>>         TX packets 3  bytes 210 (210.0 B)
>>         TX errors 0  dropped 0 overruns 0  carrier 0 collisions 0
>>
>> lo: flags=73  mtu 65536
>>         inet 127.0.0.1  netmask 255.0.0.0
>>         inet6 ::1  prefixlen 128  scopeid 0x10
>>         unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 
txqueuelen 0 (UNSPEC)
>>         RX packets 12  bytes 1285 (1.2 KiB)
>>         RX errors 0  dropped 0  overruns 0  frame 0
>>         TX packets 12  bytes 1285 (1.2 KiB)
>>         TX errors 0  dropped 0 overruns 0  carrier 0 collisions 0
>>
>> p2p0: flags=4099  mtu 1500
>>         unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 
txqueuelen 1000  (UNSPEC)
>>         RX packets 0  bytes 0 (0.0 B)
>>         RX errors 0  dropped 0  overruns 0  frame 0
>>         TX packets 0  bytes 0 (0.0 B)
>>         TX errors 0  dropped 0 overruns 0  carrier 0 collisions 0
>>
>> rmnet_data0: flags=65  mtu 1500
>>         inet 10.132.157.153  netmask 255.255.255.252
>>         inet6 fe80::fbbc:50e0:b07d:6380  prefixlen 64  scopeid
0x20
>>         unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 
txqueuelen 1000  (UNSPEC)
>>         RX packets 3041  bytes 1828676 (1.7 MiB)
>>         RX errors 0  dropped 0  overruns 0  frame 0
>>         TX packets 3079  bytes 794069 (775.4 KiB)
>>         TX errors 0  dropped 0 overruns 0  carrier 0 collisions 0
>>
>> rmnet_data7: flags=65  mtu 2000
>>         inet6 fe80::e516:d4c5:e5f7:e54e  prefixlen 64  scopeid
0x20
>>         unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 
txqueuelen 1000  (UNSPEC)
>>         RX packets 8  bytes 620 (620.0 B)
>>         RX errors 0  dropped 0  overruns 0  frame 0
>>         TX packets 10  bytes 752 (752.0 B)
>>         TX errors 0  dropped 0 overruns 0  carrier 0 collisions 0
>>
>> rmnet_ipa0: flags=65  mtu 2000
>>         unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 
txqueuelen 1000  (UNSPEC)
>>         RX packets 1926  bytes 1865884 (1.7 MiB)
>>         RX errors 0  dropped 0  overruns 0  frame 0
>>         TX packets 3089  bytes 794821 (776.1 KiB)
>>         TX errors 0  dropped 0 overruns 0  carrier 0 collisions 0
>>
>> wlan0: flags=4099  mtu 1500
>>         unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 
txqueuelen 1000  (UNSPEC)
>>         RX packets 0  bytes 0 (0.0 B)
>>         RX errors 0  dropped 0  overruns 0  frame 0
>>         TX packets 0  bytes 0 (0.0 B)
>>         TX errors 0  dropped 0 overruns 0  carrier 0 collisions 0
>>
>>
>> ipaddr on Termux:
>>
>> 1: lo:  mtu 65536 qdisc noqueue state
UNKNOWN group default
>>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>>     inet 127.0.0.1/8 <http://127.0.0.1/8> scope host lo
>>        valid_lft forever preferred_lft forever
>>     inet6 ::1/128 scope host
>>        valid_lft forever preferred_lft forever
>> 2: dummy0:  mtu 1500 qdisc noqueue
state UNKNOWN group default
>>     link/ether 4a:8e:42:43:b8:43 brd ff:ff:ff:ff:ff:ff
>>     inet6 fe80::488e:42ff:fe43:b843/64 scope link
>>        valid_lft forever preferred_lft forever
>> 3: sit0@NONE:  mtu 1480 qdisc noop state DOWN gro

Re: [OMPI users] MPI executable fails on ArchLinux on Termux

2021-02-25 Thread Gilles Gouaillardet via users
a0
>>>>valid_lft forever preferred_lft forever
>>>> inet6 fe80::93a5:ad99:4660:adc4/64 scope link
>>>>valid_lft forever preferred_lft forever
>>>>
>>>> Errno==13 is EACCESS, which generically translates to "permission denied". 
>>>>  Since you're running as root, this suggests that something outside of 
>>>> your local environment (e.g., outside of that immediate layer of 
>>>> virtualization) is preventing Open MPI from making that 
>>>> ioctl(SIOCGIFHWADDR) call (all that call is trying to do is discover the 
>>>> MAC address of that interface).
>>>>
>>>> Indeed, it looks like rmnet_data0 somehow doesn't have a MAC address...?
>>>>
>>>> rmnet_data0: flags=65 mtu 1500
>>>> inet 10.140.58.138 netmask 255.255.255.252
>>>> inet6 fe80::93a5:ad99:4660:adc4 prefixlen 64 scopeid 0x20
>>>> unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 txqueuelen 
>>>> 1000 (UNSPEC)
>>>> RX packets 416796 bytes 376287723 (358.8 MiB)
>>>> RX errors 0 dropped 0 overruns 0 frame 0
>>>> TX packets 318293 bytes 69933666 (66.6 MiB)
>>>> TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
>>>>
>>>> That's... weird.  I don't know the details of this network stack; it's 
>>>> somewhat outside the bounds of "normal" IP-based networking if there's no 
>>>> MAC address.  As such, it doesn't surprise me that -- given that one of 
>>>> Open MPI's core assumptions fails -- Open MPI fails / refuses to run.
>>>>
>>>> I don't know if anyone has tried to run Open MPI in such a virtualized 
>>>> environment before.
>>>>
>>>>
>>>>
>>>>
>>>> On Feb 25, 2021, at 6:04 AM, LINUS FERNANDES via users 
>>>>  wrote:
>>>>
>>>> So the OpenMPI version on Arch Linux can't be made operational?
>>>>
>>>> On Thu, 25 Feb 2021, 15:43 LINUS FERNANDES,  
>>>> wrote:
>>>>>
>>>>> Nope. None of the commands exist. So no, I'd say.
>>>>>
>>>>> On Thu, 25 Feb 2021, 15:11 Gilles Gouaillardet via users, 
>>>>>  wrote:
>>>>>>
>>>>>> https://www.letmegooglethat.com/?q=how+to+check+if+selinux+is+enabled=1
>>>>>>
>>>>>> On Thu, Feb 25, 2021 at 6:15 PM LINUS FERNANDES via users
>>>>>>  wrote:
>>>>>> >
>>>>>> > How do I know that? I'm not a Linux expert. I simply want to get 
>>>>>> > OpenMPI running on Arch Linux so that I can test out their Java 
>>>>>> > wrappers which I obviously can't on Termux since it doesn't support 
>>>>>> > OpenJDK.
>>>>>> >
>>>>>> > On Thu, 25 Feb 2021, 13:37 Gilles Gouaillardet via users, 
>>>>>> >  wrote:
>>>>>> >>
>>>>>> >> Is SELinux running on ArchLinux under Termux?
>>>>>> >>
>>>>>> >> On 2/25/2021 4:36 PM, LINUS FERNANDES via users wrote:
>>>>>> >> > Yes, I did not receive this in my inbox since I set to receive 
>>>>>> >> > digest.
>>>>>> >> >
>>>>>> >> > 
>>>>>> >> > ifconfig output:
>>>>>> >> >
>>>>>> >> > dummy0: flags=195 mtu 1500
>>>>>> >> > inet6 fe80::38a0:1bff:fe81:d4f5 prefixlen 64 scopeid 
>>>>>> >> > 0x20
>>>>>> >> > unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
>>>>>> >> > txqueuelen 0 (UNSPEC)
>>>>>> >> > RX packets 0 bytes 0 (0.0 B)
>>>>>> >> > RX errors 0 dropped 0 overruns 0 frame 0
>>>>>> >> > TX packets 3 bytes 210 (210.0 B)
>>>>>> >> > TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
>>>>>> >> >
>>>>>> >> > lo: flags=73 mtu 65536
>>>>>> >> > inet 127.0.0.1 netmask 255.0.0.0
>>>>>> >> > inet6 ::1 prefixlen 128 scopeid 0x10
>>>>>> >> > unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
>>>>&

Re: [OMPI users] MPI executable fails on ArchLinux on Termux

2021-02-25 Thread Gilles Gouaillardet via users
https://www.letmegooglethat.com/?q=how+to+check+if+selinux+is+enabled=1

On Thu, Feb 25, 2021 at 6:15 PM LINUS FERNANDES via users
 wrote:
>
> How do I know that? I'm not a Linux expert. I simply want to get OpenMPI 
> running on Arch Linux so that I can test out their Java wrappers which I 
> obviously can't on Termux since it doesn't support OpenJDK.
>
> On Thu, 25 Feb 2021, 13:37 Gilles Gouaillardet via users, 
>  wrote:
>>
>> Is SELinux running on ArchLinux under Termux?
>>
>> On 2/25/2021 4:36 PM, LINUS FERNANDES via users wrote:
>> > Yes, I did not receive this in my inbox since I set to receive digest.
>> >
>> > 
>> > ifconfig output:
>> >
>> > dummy0: flags=195 mtu 1500
>> > inet6 fe80::38a0:1bff:fe81:d4f5 prefixlen 64 scopeid 0x20
>> > unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
>> > txqueuelen 0 (UNSPEC)
>> > RX packets 0 bytes 0 (0.0 B)
>> > RX errors 0 dropped 0 overruns 0 frame 0
>> > TX packets 3 bytes 210 (210.0 B)
>> > TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
>> >
>> > lo: flags=73 mtu 65536
>> > inet 127.0.0.1 netmask 255.0.0.0
>> > inet6 ::1 prefixlen 128 scopeid 0x10
>> > unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
>> > txqueuelen 0 (UNSPEC)
>> > RX packets 17247 bytes 2062939 (1.9 MiB)
>> > RX errors 0 dropped 0 overruns 0 frame 0
>> > TX packets 17247 bytes 2062939 (1.9 MiB)
>> > TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
>> >
>> > p2p0: flags=4099 mtu 1500
>> > unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
>> > txqueuelen 1000 (UNSPEC)
>> > RX packets 0 bytes 0 (0.0 B)
>> > RX errors 0 dropped 0 overruns 0 frame 0
>> > TX packets 0 bytes 0 (0.0 B)
>> > TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
>> >
>> > rmnet_data0: flags=65 mtu 1500
>> > inet 10.140.58.138 netmask 255.255.255.252
>> > inet6 fe80::93a5:ad99:4660:adc4 prefixlen 64 scopeid 0x20
>> > unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
>> > txqueuelen 1000 (UNSPEC)
>> > RX packets 416796 bytes 376287723 (358.8 MiB)
>> > RX errors 0 dropped 0 overruns 0 frame 0
>> > TX packets 318293 bytes 69933666 (66.6 MiB)
>> > TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
>> >
>> > rmnet_data7: flags=65 mtu 2000
>> > inet6 fe80::a6b7:c914:44de:639 prefixlen 64 scopeid 0x20
>> > unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
>> > txqueuelen 1000 (UNSPEC)
>> > RX packets 8 bytes 620 (620.0 B)
>> > RX errors 0 dropped 0 overruns 0 frame 0
>> > TX packets 10 bytes 752 (752.0 B)
>> > TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
>> >
>> > rmnet_ipa0: flags=65 mtu 2000
>> > unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
>> > txqueuelen 1000 (UNSPEC)
>> > RX packets 222785 bytes 381290027 (363.6 MiB)
>> > RX errors 0 dropped 0 overruns 0 frame 0
>> > TX packets 318303 bytes 69934418 (66.6 MiB)
>> > TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
>> >
>> > wlan0: flags=4099 mtu 1500
>> > unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
>> > txqueuelen 1000 (UNSPEC)
>> > RX packets 650238 bytes 739939859 (705.6 MiB)
>> > RX errors 0 dropped 0 overruns 0 frame 0
>> > TX packets 408284 bytes 63728624 (60.7 MiB)
>> > TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
>> >
>> >
>> > -
>> > ip addr output:
>> >
>> > 1: lo:  mtu 65536 qdisc noqueue state UNKNOWN
>> > group default
>> > link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>> > inet 127.0.0.1/8 <http://127.0.0.1/8> scope host lo
>> >valid_lft forever preferred_lft forever
>> > inet6 ::1/128 scope host
>> >valid_lft forever preferred_lft forever
>> > 2: dummy0:  mtu 1500 qdisc noqueue state
>> > UNKNOWN group default
>> > link/ether 3a:a0:1b:81:d4:f5 brd ff:ff:ff:ff:ff:ff
>> > inet6 fe80::38a0:1bff:fe81:d4f5/64 scope link
>> >valid_lft forever preferred_lft forever
>> > 3: sit0@NONE:  mt

Re: [OMPI users] MPI executable fails on ArchLinux on Termux

2021-02-25 Thread Gilles Gouaillardet via users

Is SELinux running on ArchLinux under Termux?

On 2/25/2021 4:36 PM, LINUS FERNANDES via users wrote:

Yes, I did not receive this in my inbox since I set to receive digest.


ifconfig output:

dummy0: flags=195 mtu 1500
        inet6 fe80::38a0:1bff:fe81:d4f5 prefixlen 64 scopeid 0x20
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 
txqueuelen 0 (UNSPEC)

        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 3 bytes 210 (210.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73 mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 
txqueuelen 0 (UNSPEC)

        RX packets 17247 bytes 2062939 (1.9 MiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 17247 bytes 2062939 (1.9 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

p2p0: flags=4099 mtu 1500
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 
txqueuelen 1000 (UNSPEC)

        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

rmnet_data0: flags=65 mtu 1500
        inet 10.140.58.138 netmask 255.255.255.252
        inet6 fe80::93a5:ad99:4660:adc4 prefixlen 64 scopeid 0x20
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 
txqueuelen 1000 (UNSPEC)

        RX packets 416796 bytes 376287723 (358.8 MiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 318293 bytes 69933666 (66.6 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

rmnet_data7: flags=65 mtu 2000
        inet6 fe80::a6b7:c914:44de:639 prefixlen 64 scopeid 0x20
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 
txqueuelen 1000 (UNSPEC)

        RX packets 8 bytes 620 (620.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 10 bytes 752 (752.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

rmnet_ipa0: flags=65 mtu 2000
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 
txqueuelen 1000 (UNSPEC)

        RX packets 222785 bytes 381290027 (363.6 MiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 318303 bytes 69934418 (66.6 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

wlan0: flags=4099 mtu 1500
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 
txqueuelen 1000 (UNSPEC)

        RX packets 650238 bytes 739939859 (705.6 MiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 408284 bytes 63728624 (60.7 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0


-
ip addr output:

1: lo:  mtu 65536 qdisc noqueue state UNKNOWN 
group default

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8  scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: dummy0:  mtu 1500 qdisc noqueue state 
UNKNOWN group default

    link/ether 3a:a0:1b:81:d4:f5 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::38a0:1bff:fe81:d4f5/64 scope link
       valid_lft forever preferred_lft forever
3: sit0@NONE:  mtu 1480 qdisc noop state DOWN group default
    link/sit 0.0.0.0 brd 0.0.0.0
4: rmnet_ipa0:  mtu 2000 qdisc pfifo_fast state UNKNOWN 
group default qlen 1000

    link/[530]
5: rmnet_data0:  mtu 1500 qdisc htb state UNKNOWN group 
default qlen 1000

    link/[530]
    inet 10.140.58.138/30  scope global 
rmnet_data0

       valid_lft forever preferred_lft forever
    inet6 fe80::93a5:ad99:4660:adc4/64 scope link
       valid_lft forever preferred_lft forever
6: rmnet_data1: <> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/[530]
7: rmnet_data2: <> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/[530]
8: rmnet_data3: <> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/[530]
9: rmnet_data4: <> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/[530]
10: rmnet_data5: <> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/[530]
11: rmnet_data6: <> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/[530]
12: rmnet_data7:  mtu 2000 qdisc htb state UNKNOWN group 
default qlen 1000

    link/[530]
    inet6 fe80::a6b7:c914:44de:639/64 scope link
       valid_lft forever preferred_lft forever
13: r_rmnet_data0: <> mtu 1500 qdisc noop state DOWN group default 
qlen 1000

    link/[530]
14: r_rmnet_data1: <> mtu 1500 qdisc noop state DOWN group default 
qlen 1000

    link/[530]
15: r_rmnet_data2: <> mtu 1500 qdisc noop state DOWN group default 
qlen 1000

    link/[530]
16: r_rmnet_data3: <> mtu 1500 qdisc noop state DOWN group default 
qlen 1000

    link/[530]
17: r_rmnet_data4: <> mtu 1500 qdisc noop state 

Re: [OMPI users] MPI executable fails on ArchLinux on Termux

2021-02-24 Thread Gilles Gouaillardet via users

Can you run


ifconfig

or

ip addr


in both Termux and ArchLinux for Termux?



On 2/25/2021 2:00 PM, LINUS FERNANDES via users wrote:


Why do I see the following error messages when executing mpirun on 
ArchLinux for Termux?


The same program executes on Termux without any glitches.

@localhost:/data/data/com.termux/files/home
[root@localhost home] mpirun --allow-run-as-root
[localhost:06773] opal_ifinit: ioctl(SIOCGIFHWADDR) failed with errno=13
[localhost:06773] pmix_ifinit: ioctl(SIOCGIFHWADDR) failed with errno=13
[localhost:06773] oob_tcp: problems getting address for index 83376
(kernel index -1)
--
No network interfaces were found for out-of-band communications. We
require at least one available network for out-of-band messaging.


Re: [OMPI users] weird mpi error report: Type mismatch between arguments

2021-02-17 Thread Gilles Gouaillardet via users
Diego,



IIRC, you now have to build your gfortran 10 apps with

-fallow-argument-mismatch
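
For example (the file name is just an illustration):

mpifort -fallow-argument-mismatch -o mysolver mysolver.f90

and if you rebuild Open MPI itself with gfortran 10, the equivalent at
configure time would be something like

./configure FCFLAGS=-fallow-argument-mismatch ...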


Cheers,



Gilles

- Original Message -

Dear OPENMPI users,

i'd like to notify you of a strange issue that arose right after 
installing a new up-to-date version of Linux (Kubuntu 20.10, with 
gcc-10).

I am developing software to be run on distributed memory machines with 
OpenMPI version 4.0.3.

The error seems to be as much simple as weird. Each time i compile the 
program, the following problem is reported:



  692 | call   MPI_BCAST (config,1000*100,MPI_DOUBLE,0,MPI_COMM_WORLD,ierr)
      |                    2
  693 | call MPI_BCAST(N,1,MPI_INT,0,MPI_COMM_WORLD,ierr2)
      |                1
Error: Type mismatch between actual argument at (1) and actual argument 
at (2) (INTEGER(4)/REAL(8)). 




If i compile the same program on an older machine (with an older version 
of gcc, the 9th one), i don't get back any error like this.

Moreover, the command doesn't sound like a syntax error nor a logical 
error, since it represents just two consecutive trivial operations of 
broadcasting, each independent to the other. The first operation 
broadcasts an array named "config" of 1000*100 double elements, while the 
second operation broadcasts a single integer variable named "N"

It seems that the compiler finds some strange links between the line 692 
and 693, and believes that i should put the two arrays "config" and "N" 
in a consistent way, while they are clearly absolutely independent so 
they are not requested to be of the same type.

Searching on the web, i read that this issue could be ascribed  to the 
new version of gcc (the 10th) which contains some well known bugs.

I have even tried to compile with the flag -fallow-argument-mismatch, 
and it fixed some similar report problems indeed, but this specific 
problem remains

what do you think about?

thank you very much

Best Regards



Diego  

 


Re: [OMPI users] GROMACS with openmpi

2021-02-11 Thread Gilles Gouaillardet via users
This is not an Open MPI question, and hence not a fit for this mailing 
list.



But here we go:

first, try

cmake -DGMX_MPI=ON ...

if it fails, try

cmake -DGMX_MPI=ON -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx ...
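
For reference, a typical launch of the MPI-enabled binary might then look like this (a sketch, assuming 4 ranks and a run input named topol.tpr):

mpirun -np 4 gmx_mpi mdrun -deffnm topol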



Cheers,



Gilles 

- Original Message -


Hi, MPI developers and users,
   I want to run GROMACS using gmx_mpi rather than gmx, could 
you give me a hand on how to do that? 
Thanks a lot!



Cheers,


 


Re: [OMPI users] OpenMPI 4.1.0 misidentifies x86 capabilities

2021-02-10 Thread Gilles Gouaillardet via users
Max,

at configure time, Open MPI detects the *compiler* capabilities.
In your case, your compiler can emit AVX512 code.
(and fwiw, the tests are only compiled and never executed)

Then at *runtime*, Open MPI detects the *CPU* capabilities.
In your case, it should not invoke the functions containing AVX512 code.
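
If you want to rule out the AVX512 code paths while testing, you can also disable the op/avx component at runtime (assuming the component is named "avx" in your build):

mpirun --mca op ^avx ./your_app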

That being said, several changes were made to the op/avx component,
so if you are experiencing some crashes, I do invite you to give a try to the
latest nightly snapshot for the v4.1.x branch.


Cheers,

Gilles

On Wed, Feb 10, 2021 at 10:43 PM Max R. Dechantsreiter via users
 wrote:
>
> Configuring OpenMPI 4.1.0 with GCC 10.2.0 on
> Intel(R) Xeon(R) CPU E5-2620 v3, a Haswell processor
> that supports AVX2 but not AVX512, resulted in
>
> checking for AVX512 support (no additional flags)... no
> checking for AVX512 support (with -march=skylake-avx512)... yes
>
> in "configure" output, and in config.log
>
> MCA_BUILD_ompi_op_has_avx512_support_FALSE='#'
> MCA_BUILD_ompi_op_has_avx512_support_TRUE=''
>
> Consequently AVX512 intrinsic functions were erroneously
> deployed, resulting in OpenMPI failure.
>
> The relevant test code was in essence
>
> cat > conftest.c << EOF
> #include <immintrin.h>
>
> int main()
> {
> __m512 vA, vB;
>
> _mm512_add_ps(vA, vB);
>
> return 0;
> }
> EOF
>
> The problem with this is that the result of the function
> is never used, so at optimization level higher than O0
> the compiler elimates the function as "dead code" (DCE).
> To wit,
>
> gcc -O3 -march=skylake-avx512 -S conftest.c
>
> yields
>
> .file   "conftest.c"
> .text
> .section.text.startup,"ax",@progbits
> .p2align 4
> .globl  main
> .type   main, @function
> main:
> .LFB5345:
> .cfi_startproc
> xorl%eax, %eax
> ret
> .cfi_endproc
> .LFE5345:
> .size   main, .-main
> .ident  "GCC: (GNU) 10.2.0"
> .section.note.GNU-stack,"",@progbits
>
> Compare this with the result of
>
> gcc -O0 -march=skylake-avx512 -S conftest.c
>
> in which the function IS called:
>
> .file   "conftest.c"
> .text
> .globl  main
> .type   main, @function
> main:
> .LFB4092:
> .cfi_startproc
> pushq   %rbp
> .cfi_def_cfa_offset 16
> .cfi_offset 6, -16
> movq%rsp, %rbp
> .cfi_def_cfa_register 6
> andq$-64, %rsp
> subq$136, %rsp
> vmovaps 72(%rsp), %zmm0
> vmovaps %zmm0, -56(%rsp)
> vmovaps 8(%rsp), %zmm0
> vmovaps %zmm0, -120(%rsp)
> movl$0, %eax
> leave
> .cfi_def_cfa 7, 8
> ret
> .cfi_endproc
> .LFE4092:
> .size   main, .-main
> .ident  "GCC: (GNU) 10.2.0"
> .section.note.GNU-stack,"",@progbits
>
> Note the use of a 512-bit ZMM register - ZMM registers
> are used only by AVX512 instructions.  Hence at O3 the
> test program does not detect the lack of AVX512 support
> by the host processor.
>
> An easy remedy would be to declare the operands as
> "volatile" and thereby force to compiler to invoke the
> function:
>
> cat > conftest.c << EOF
> #include <immintrin.h>
>
> int main()
> {
> volatile __m512 vA, vB;
>
> _mm512_add_ps(vA, vB);
>
> return 0;
> }
>
> Compiled at O3, the resulting executable dumps core as it
> should when run on my Haswell processor, returning nonzero
> exit status ($?), which would inform "configure" that the
> processor does not have AVX512 capability.
>
> Finally please note that this error could affect the
> detection of support for other instruction sets on other
> families of processors: compiler optimization must be
> inhibited for such tests to be reliable!
>
> Max
> ---
> Max R. Dechantsreiter
> President
> Performance Jones L.L.C.
> m...@performancejones.com
> Skype: PerformanceJones (UTC+01:00)
> +1 414 446-3100 (telephone/voicemail)
> http://www.linkedin.com/in/benchmarking


Re: [OMPI users] OMPI 4.1 in Cygwin packages?

2021-02-04 Thread Gilles Gouaillardet via users
Martin,

at first glance, I could not spot the root cause.

That being said, the second note is sometimes referred as
"WinDev2021Eval" in the logs, but it is also referred as "worker".

What if you use the real names in your hostfile: DESKTOP-C0G4680 and
WinDev2021Eval instead of master and worker?
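
For example (a sketch, reusing the slot counts from your hostfile):

DESKTOP-C0G4680 slots=5
WinDev2021Eval slots=5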

Cheers,

Gilles

On Fri, Feb 5, 2021 at 5:59 AM Martín Morales via users
 wrote:
>
> Hello all,
>
>
>
> Gilles, unfortunately, the result is the same. Attached the log you ask me.
>
>
>
> Jeff, some time ago I tried with OMPI 4.1.0 (Linux) and it worked.
>
>
>
> Thank you both. Regards,
>
>
>
> Martín
>
>
>
> From: Jeff Squyres (jsquyres) via users
> Sent: jueves, 4 de febrero de 2021 16:10
> To: Open MPI User's List
> Cc: Jeff Squyres (jsquyres)
> Subject: Re: [OMPI users] OMPI 4.1 in Cygwin packages?
>
>
>
> Do we know if this was definitely fixed in v4.1.x?
>
>
> > On Feb 4, 2021, at 7:46 AM, Gilles Gouaillardet via users 
> >  wrote:
> >
> > Martin,
> >
> > this is a connectivity issue reported by the btl/tcp component.
> >
> > You can try restricting the IP interface to a subnet known to work
> > (and with no firewall) between both hosts
> >
> > mpirun --mca btl_tcp_if_include 192.168.0.0/24 ...
> >
> > If the error persists, you can
> >
> > mpirun --mca btl_tcp_base_verbose 20 ...
> >
> > and then compress and post the logs so we can have a look
> >
> >
> > Cheers,
> >
> > Gilles
> >
> > On Thu, Feb 4, 2021 at 9:33 PM Martín Morales via users
> >  wrote:
> >>
> >> Hi Marcos,
> >>
> >>
> >>
> >> Yes, I have a problem with spawning to a “worker” host (on localhost, 
> >> works). There are just two machines: “master” and “worker”.  I’m using 
> >> Windows 10 in both with same Cygwin and packages. Pasted below some 
> >> details.
> >>
> >> Thanks for your help. Regards,
> >>
> >>
> >>
> >> Martín
> >>
> >>
> >>
> >> 
> >>
> >>
> >>
> >> Running:
> >>
> >>
> >>
> >> mpirun -np 1 -hostfile ./hostfile ./spawner.exe 8
> >>
> >>
> >>
> >> hostfile:
> >>
> >>
> >>
> >> master slots=5
> >>
> >> worker slots=5
> >>
> >>
> >>
> >> Error:
> >>
> >>
> >>
> >> At least one pair of MPI processes are unable to reach each other for
> >>
> >> MPI communications.  This means that no Open MPI device has indicated
> >>
> >> that it can be used to communicate between these processes.  This is
> >>
> >> an error; Open MPI requires that all MPI processes be able to reach
> >>
> >> each other.  This error can sometimes be the result of forgetting to
> >>
> >> specify the "self" BTL.
> >>
> >>
> >>
> >> Process 1 ([[31598,1],0]) is on host: DESKTOP-C0G4680
> >>
> >> Process 2 ([[31598,2],2]) is on host: worker
> >>
> >> BTLs attempted: self tcp
> >>
> >>
> >>
> >> Your MPI job is now going to abort; sorry.
> >>
> >> --
> >>
> >> [DESKTOP-C0G4680:02828] [[31598,1],0] ORTE_ERROR_LOG: Unreachable in file 
> >> /pub/devel/openmpi/v4.0/openmpi-4.0.5-1.x86_64/src/openmpi-4.0.5/ompi/dpm/dpm.c
> >>  at line 493
> >>
> >> [DESKTOP-C0G4680:02828] *** An error occurred in MPI_Comm_spawn
> >>
> >> [DESKTOP-C0G4680:02828] *** reported by process [2070806529,0]
> >>
> >> [DESKTOP-C0G4680:02828] *** on communicator MPI_COMM_SELF
> >>
> >> [DESKTOP-C0G4680:02828] *** MPI_ERR_INTERN: internal error
> >>
> >> [DESKTOP-C0G4680:02828] *** MPI_ERRORS_ARE_FATAL (processes in this 
> >> communicator will now abort,
> >>
> >> [DESKTOP-C0G4680:02828] ***and potentially your MPI job)
> >>
> >>
> >>
> >> USER_SSH@DESKTOP-C0G4680 ~
> >>
> >> $ [WinDev2012Eval:00120] [[31598,2],2] ORTE_ERROR_LOG: Unreachable in file 
> >> /pub/devel/openmpi/v4.0/openmpi-4.0.5-1.x86_64/src/openmpi-4.0.5/ompi/dpm/dpm.c
> >>  at line 493
> >>
> >> [WinDev2012Eval:00121] [[31598,2],3] ORTE_ERROR_LOG: Unreachable in file 
> >> /pub/devel/openmpi/v4.0/openmpi-4.0.5-1.x86_64

Re: [OMPI users] OMPI 4.1 in Cygwin packages?

2021-02-04 Thread Gilles Gouaillardet via users
Martin,

this is a connectivity issue reported by the btl/tcp component.

You can try restricting the IP interface to a subnet known to work
(and with no firewall) between both hosts

mpirun --mca btl_tcp_if_include 192.168.0.0/24 ...

If the error persists, you can

mpirun --mca btl_tcp_base_verbose 20 ...

and then compress and post the logs so we can have a look


Cheers,

Gilles

On Thu, Feb 4, 2021 at 9:33 PM Martín Morales via users
 wrote:
>
> Hi Marcos,
>
>
>
> Yes, I have a problem with spawning to a “worker” host (on localhost, works). 
> There are just two machines: “master” and “worker”.  I’m using Windows 10 in 
> both with same Cygwin and packages. Pasted below some details.
>
> Thanks for your help. Regards,
>
>
>
> Martín
>
>
>
> 
>
>
>
> Running:
>
>
>
> mpirun -np 1 -hostfile ./hostfile ./spawner.exe 8
>
>
>
> hostfile:
>
>
>
> master slots=5
>
> worker slots=5
>
>
>
> Error:
>
>
>
> At least one pair of MPI processes are unable to reach each other for
>
> MPI communications.  This means that no Open MPI device has indicated
>
> that it can be used to communicate between these processes.  This is
>
> an error; Open MPI requires that all MPI processes be able to reach
>
> each other.  This error can sometimes be the result of forgetting to
>
> specify the "self" BTL.
>
>
>
> Process 1 ([[31598,1],0]) is on host: DESKTOP-C0G4680
>
> Process 2 ([[31598,2],2]) is on host: worker
>
> BTLs attempted: self tcp
>
>
>
> Your MPI job is now going to abort; sorry.
>
> --
>
> [DESKTOP-C0G4680:02828] [[31598,1],0] ORTE_ERROR_LOG: Unreachable in file 
> /pub/devel/openmpi/v4.0/openmpi-4.0.5-1.x86_64/src/openmpi-4.0.5/ompi/dpm/dpm.c
>  at line 493
>
> [DESKTOP-C0G4680:02828] *** An error occurred in MPI_Comm_spawn
>
> [DESKTOP-C0G4680:02828] *** reported by process [2070806529,0]
>
> [DESKTOP-C0G4680:02828] *** on communicator MPI_COMM_SELF
>
> [DESKTOP-C0G4680:02828] *** MPI_ERR_INTERN: internal error
>
> [DESKTOP-C0G4680:02828] *** MPI_ERRORS_ARE_FATAL (processes in this 
> communicator will now abort,
>
> [DESKTOP-C0G4680:02828] ***and potentially your MPI job)
>
>
>
> USER_SSH@DESKTOP-C0G4680 ~
>
> $ [WinDev2012Eval:00120] [[31598,2],2] ORTE_ERROR_LOG: Unreachable in file 
> /pub/devel/openmpi/v4.0/openmpi-4.0.5-1.x86_64/src/openmpi-4.0.5/ompi/dpm/dpm.c
>  at line 493
>
> [WinDev2012Eval:00121] [[31598,2],3] ORTE_ERROR_LOG: Unreachable in file 
> /pub/devel/openmpi/v4.0/openmpi-4.0.5-1.x86_64/src/openmpi-4.0.5/ompi/dpm/dpm.c
>  at line 493
>
> --
>
> It looks like MPI_INIT failed for some reason; your parallel process is
>
> likely to abort.  There are many reasons that a parallel process can
>
> fail during MPI_INIT; some of which are due to configuration or environment
>
> problems.  This failure appears to be an internal failure; here's some
>
> additional information (which may only be relevant to an Open MPI
>
> developer):
>
>
>
> ompi_dpm_dyn_init() failed
>
> --> Returned "Unreachable" (-12) instead of "Success" (0)
>
> --
>
> [WinDev2012Eval:00121] *** An error occurred in MPI_Init
>
> [WinDev2012Eval:00121] *** reported by process 
> [15289389101093879810,12884901891]
>
> [WinDev2012Eval:00121] *** on a NULL communicator
>
> [WinDev2012Eval:00121] *** Unknown error
>
> [WinDev2012Eval:00121] *** MPI_ERRORS_ARE_FATAL (processes in this 
> communicator will now abort,
>
> [WinDev2012Eval:00121] ***and potentially your MPI job)
>
> [DESKTOP-C0G4680:02831] 2 more processes have sent help message 
> help-mca-bml-r2.txt / unreachable proc
>
> [DESKTOP-C0G4680:02831] Set MCA parameter "orte_base_help_aggregate" to 0 to 
> see all help / error messages
>
> [DESKTOP-C0G4680:02831] 1 more process has sent help message 
> help-mpi-runtime.txt / mpi_init:startup:internal-failure
>
> [DESKTOP-C0G4680:02831] 1 more process has sent help message 
> help-mpi-errors.txt / mpi_errors_are_fatal unknown handle
>
>
>
> Script spawner:
>
>
>
> #include "mpi.h"
>
> #include <stdio.h>
>
> #include <stdlib.h>
>
> #include 
>
>
>
> int main(int argc, char ** argv){
>
> int processesToRun;
>
> MPI_Comm intercomm;
>
> MPI_Info info;
>
>
>
>if(argc < 2 ){
>
>   printf("Processes number needed!\n");
>
>   return 0;
>
>}
>
>processesToRun = atoi(argv[1]);
>
> MPI_Init( NULL, NULL );
>
>printf("Spawning from parent:...\n");
>
>MPI_Comm_spawn( "./spawned.exe", MPI_ARGV_NULL, processesToRun, 
> MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);
>
>
>
> MPI_Finalize();
>
> return 0;
>
> }
>
>
>
> Script spawned:
>
>
>
> #include "mpi.h"
>
> #include 
>
> #include 
>
>
>
> int main(int argc, char ** argv){
>
> int hostName_len,rank, size;
>
> MPI_Comm parentcomm;
>
> char 

Re: [OMPI users] Debugging a crash

2021-01-29 Thread Gilles Gouaillardet via users
Diego,

the mpirun command line starts 2 MPI task, but the error log mentions
rank 56, so unless there is a copy/paste error, this is highly
suspicious.

I invite you to check the filesystem usage on this node, and make sure
there is a similar amount of available space in /tmp and /dev/shm (or
other filesystem if you use a non-standard $TMPDIR)
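
For example (standard Linux tools; adjust the paths if $TMPDIR points elsewhere):

df -h /tmp /dev/shm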

Cheers,

Gilles

On Fri, Jan 29, 2021 at 10:50 PM Diego Zuccato via users
 wrote:
>
> Hello all.
>
> I'm having a problem with a job: if it gets scheduled on a specific node
> of our cluster, it fails with:
> -8<--
> --
> Primary job  terminated normally, but 1 process returned
> a non-zero exit code. Per user-direction, the job has been aborted.
> --
> [str957-mtx-10:38099] *** Process received signal ***
> [str957-mtx-10:38099] Signal: Segmentation fault (11)
> [str957-mtx-10:38099] Signal code: Address not mapped (1)
> [str957-mtx-10:38099] Failing at address: 0x7f98cb266008
> [str957-mtx-10:38099] [ 0]
> /lib/x86_64-linux-gnu/libpthread.so.0(+0x12730)[0x7f98ca553730]
> [str957-mtx-10:38099] [ 1]
> /usr/lib/x86_64-linux-gnu/pmix/lib/pmix/mca_gds_ds21.so(+0x2936)[0x7f98c8a99936]
> [str957-mtx-10:38099] [ 2]
> /lib/x86_64-linux-gnu/libmca_common_dstore.so.1(pmix_common_dstor_init+0x9d3)[0x7f98c8a82733]
> [str957-mtx-10:38099] [ 3]
> /usr/lib/x86_64-linux-gnu/pmix/lib/pmix/mca_gds_ds21.so(+0x25b4)[0x7f98c8a995b4]
> [str957-mtx-10:38099] [ 4]
> /lib/x86_64-linux-gnu/libpmix.so.2(pmix_gds_base_select+0x12e)[0x7f98c8bdc46e]
> [str957-mtx-10:38099] [ 5]
> /lib/x86_64-linux-gnu/libpmix.so.2(pmix_rte_init+0x8cd)[0x7f98c8b9488d]
> [str957-mtx-10:38099] [ 6]
> /lib/x86_64-linux-gnu/libpmix.so.2(PMIx_Init+0xdc)[0x7f98c8b50d7c]
> [str957-mtx-10:38099] [ 7]
> /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_pmix_ext2x.so(ext2x_client_init+0xc4)[0x7f98c8c3afe4]
> [str957-mtx-10:38099] [ 8]
> /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_ess_pmi.so(+0x2656)[0x7f98c946d656]
> [str957-mtx-10:38099] [ 9]
> /lib/x86_64-linux-gnu/libopen-rte.so.40(orte_init+0x29a)[0x7f98ca2c111a]
> [str957-mtx-10:38099] [10]
> /lib/x86_64-linux-gnu/libmpi.so.40(ompi_mpi_init+0x252)[0x7f98cae1ce62]
> [str957-mtx-10:38099] [11]
> /lib/x86_64-linux-gnu/libmpi.so.40(MPI_Init+0x6e)[0x7f98cae4b17e]
> [str957-mtx-10:38099] [12] Arepo(+0x3940)[0x561b45905940]
> [str957-mtx-10:38099] [13]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb)[0x7f98ca3a409b]
> [str957-mtx-10:38099] [14] Arepo(+0x3d3a)[0x561b45905d3a]
> [str957-mtx-10:38099] *** End of error message ***
> --
> mpiexec noticed that process rank 56 with PID 37999 on node
> str957-mtx-10 exited on signal 11 (Segmentation fault).
> --
> slurmstepd-str957-mtx-00: error: *** JOB 12129 ON str957-mtx-00
> CANCELLED AT 2021-01-28T14:11:33 ***
> -8<--
> [I cut out the other repetitions of the stack trace for brevity.]
>
> The command used to launch it is:
> mpirun --mca mpi_leave_pinned 0 --mca oob_tcp_listen_mode listen_thread
> -np 2 --map-by socket Arepo someargs
>
> The same job, when scheduled to run on another node, works w/o problems.
> For what I could check, the nodes are configured the same (actually
> installed from the same series of scripts and following the same
> procedure: it was a set of 16 nodes and just one is giving troubles).
> I tried with simpler MPI codes and could not reproduce the error. Other
> users are using the same node w/o problems with different codes.
> Packages are the same on all nodes. I already double-checked that kernel
> module config is the same and memlock is unlimited.
> Any hint where to look?
>
> Tks.
>
> --
> Diego Zuccato
> DIFA - Dip. di Fisica e Astronomia
> Servizi Informatici
> Alma Mater Studiorum - Università di Bologna
> V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
> tel.: +39 051 20 95786


Re: [OMPI users] Error with building OMPI with PGI

2021-01-19 Thread Gilles Gouaillardet via users
Passant,

unless this is a copy paste error, the last error message reads plus
zero three, which is clearly an unknown switch
(plus uppercase O three is a known one)

At the end of the configure, make sure Fortran bindings are generated.

If the link error persists, you can
nm -D /.../libmpi_mpifh.so | grep igatherv
and confirm the symbol does indeed exists


Cheers,

Gilles

On Tue, Jan 19, 2021 at 7:39 PM Passant A. Hafez via users
 wrote:
>
> Hello Gus,
>
>
> Thanks for your reply.
>
> Yes I've read multiple threads for very old versions of OMPI and PGI before 
> posting, some said it'll be patched so I thought this is fixed in the recent 
> versions. And some fixes didn't work for me.
>
>
>
> Now I tried the first suggestion (CC="pgcc -noswitcherror" as the error is 
> with pgcc)
>
> The OMPI build was finished, but when I tried to use it to build QE GPU 
> version, I got:
>
> undefined reference to `ompi_igatherv_f'
>
>
>
> I tried the other workaround
> https://www.mail-archive.com/users@lists.open-mpi.org/msg10375.html
>
> to rebuild OMPI, I got
> pgcc-Error-Unknown switch: +03
>
>
> Please advise.
>
> All the best,
> Passant
> 
> From: users  on behalf of Gus Correa via 
> users 
> Sent: Friday, January 15, 2021 2:36 AM
> To: Open MPI Users
> Cc: Gus Correa
> Subject: Re: [OMPI users] Error with building OMPI with PGI
>
> Hi Passant, list
>
> This is an old problem with PGI.
> There are many threads in the OpenMPI mailing list archives about this,
> with workarounds.
> The simplest is to use FC="pgf90 -noswitcherror".
>
> Here are two out of many threads ... well,  not pthreads!  :)
> https://www.mail-archive.com/users@lists.open-mpi.org/msg08962.html
> https://www.mail-archive.com/users@lists.open-mpi.org/msg10375.html
>
> I hope this helps,
> Gus Correa
>
> On Thu, Jan 14, 2021 at 5:45 PM Passant A. Hafez via users 
>  wrote:
>>
>> Hello,
>>
>>
>> I'm having an error when trying to build OMPI 4.0.3 (also tried 4.1) with 
>> PGI 20.1
>>
>>
>> ./configure CPP=cpp CC=pgcc CXX=pgc++ F77=pgf77 FC=pgf90 --prefix=$PREFIX 
>> --with-ucx=$UCX_HOME --with-slurm --with-pmi=/opt/slurm/cluster/ibex/install 
>> --with-cuda=$CUDATOOLKIT_HOME
>>
>>
>> in the make install step:
>>
>> make[4]: Leaving directory `/tmp/openmpi-4.0.3/opal/mca/pmix/pmix3x'
>> make[3]: Leaving directory `/tmp/openmpi-4.0.3/opal/mca/pmix/pmix3x'
>> make[2]: Leaving directory `/tmp/openmpi-4.0.3/opal/mca/pmix/pmix3x'
>> Making install in mca/pmix/s1
>> make[2]: Entering directory `/tmp/openmpi-4.0.3/opal/mca/pmix/s1'
>>   CCLD mca_pmix_s1.la
>> pgcc-Error-Unknown switch: -pthread
>> make[2]: *** [mca_pmix_s1.la] Error 1
>> make[2]: Leaving directory `/tmp/openmpi-4.0.3/opal/mca/pmix/s1'
>> make[1]: *** [install-recursive] Error 1
>> make[1]: Leaving directory `/tmp/openmpi-4.0.3/opal'
>> make: *** [install-recursive] Error 1
>>
>> Please advise.
>>
>>
>>
>>
>> All the best,
>> Passant


Re: [OMPI users] 4.1 mpi-io test failures on lustre

2021-01-18 Thread Gilles Gouaillardet via users

Dave,


On 1/19/2021 2:13 AM, Dave Love via users wrote:


Generally it's not surprising if there's a shortage
of effort when outside contributions seem unwelcome.  I've tried to
contribute several times.  The final attempt wasted two or three days,
after being encouraged to get the port of current romio into a decent
state when it was being done separately "behind the scenes", but that
hasn't been released.


External contributions are not only welcome, they are encouraged.

All Pull Requests will be considered for inclusion upstream

(as long as the commits are properly signed-off).

You could not be more wrong on that part, and since you chose to bring 
this to the public mailing list,


let me recap the facts:



ROMIO is refreshed when needed (and time allows it)

All code changes are coming from public Pull Requests.

For example :

 - ROMIO 3.3.2 refresh (https://github.com/open-mpi/ompi/pull/8249 - 
issued on November 24th 2020)


 - ROMIO 3.4b1 refresh (https://github.com/open-mpi/ompi/pull/8279 - 
issued December 10th 2020)


 - ROMIO 3.4 refresh (https://github.com/open-mpi/ompi/pull/8343 - 
January 6th 2021)



On the other hand, this is what you did:

on December 2nd you wrote to the ML:


In the meantime I've hacked in romio from mpich-4.3b1 without really
understanding what I'm doing;
and finally  posted a link to your code on December 11th (and detailed a 
shortcut you took), before deleting your repository (!) around December 
16th.


Unless I missed it, you never issued a Pull Request.





It took some time to figure out upstream ROMIO 3.3.2 did not pass the 
HDF5 test on Lustre,


and a newer ROMIO (3.4b1 at that time, 3.4 now) had to be used in order 
to fix the issue on the long term.


All the heavy lifting was already done in #8249, very likely before you 
even started hacking, and moving to 3.4b1


and 3.4 was then  very straightforward.


ROMIO 3.4 refresh will be merged in the master branch once properly 
tested and reviewed, and the goal


is to have this available in Open MPI 5.

ROMIO fixes will be applied to the release branches (and they are 
available at https://github.com/open-mpi/ompi/pull/8371)


once tested and reviewed.



Bottom line, your "hack" is the only one that was actually done behind 
the scene,


and has returned there since.

All pull requests are welcome, with - as far as I am concerned - the 
following caveat (besides signed-off commits):


Open MPI is a meritocracy.

If you had issued a proper PR (you did not, but chose to post a - now 
broken - link to your code instead),


it would likely have been rejected based on its (lack of) merits.


There are many ways to contribute to Open MPI, and in this case, 
testing/discussing the Pull Requests/Issues on github


would have been (and will be) very helpful to the Open MPI community.

On the contrary, ranting and bragging on a public ML are - in my not so 
humble opinion - counter productive, but I have a pretty high threshold


for this kind of BS. However, I have a much lower threshold for your 
gross mischaracterization of the Open MPI community, its values, and how 
the work gets done.




Cheers,


Gilles



Re: [OMPI users] Timeout in MPI_Bcast/MPI_Barrier?

2021-01-11 Thread Gilles Gouaillardet via users
Daniel,

the test works in my environment (1 node, 32 GB memory) with all the
mentioned parameters.

Did you check the memory usage on your nodes and make sure the OOM
killer did not kill any process?
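
For example, a quick check on each node (assuming you can read the kernel log) is:

dmesg | grep -iE 'out of memory|killed process'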

Cheers,

Gilles

On Tue, Jan 12, 2021 at 1:48 AM Daniel Torres via users
 wrote:
>
> Hi.
>
> Thanks for responding. I have taken the most important parts from my code and 
> I created a test that reproduces the behavior I described previously.
>
> I attach to this e-mail the compressed file "test.tar.gz". Inside him, you 
> can find:
>
> 1.- The .c source code "test.c", which I compiled with "mpicc -g -O3 test.c 
> -o test -lm". The main work is performed on the function "work_on_grid", 
> starting at line 162.
> 2.- Four execution examples in two different machines (my own and a cluster 
> machine), which I executed with "mpiexec -np 16 --machinefile hostfile 
> --map-by node --mca btl tcp,vader,self --mca btl_base_verbose 100 ./test 4096 
> 4096", varying the last two arguments with 4096, 8192 and 16384 (a matrix 
> size). The error appears with bigger numbers (8192 in my machine, 16384 in 
> the cluster)
> 3.- The "ompi_info -a" output from the two machines.
> 4.- The hostfile.
>
> The duration of the delay is just a few seconds, about 3 ~ 4.
>
> Essentially, the first error message I get from a waiting process is "74: 
> MPI_ERR_PROC_FAILED: Process Failure".
>
> Hope this information can help.
>
> Thanks a lot for your time.
>
> El 08/01/21 a las 18:40, George Bosilca via users escribió:
>
> Daniel,
>
> There are no timeouts in OMPI with the exception of the initial connection 
> over TCP, where we use the socket timeout to prevent deadlocks. As you 
> already did quite a few communicator duplications and other collective 
> communications before you see the timeout, we need more info about this. As 
> Gilles indicated, having the complete output might help. What is the duration 
> of the delay for the waiting process ? Also, can you post a replicator of 
> this issue ?
>
>   George.
>
>
> On Fri, Jan 8, 2021 at 9:03 AM Gilles Gouaillardet via users 
>  wrote:
>>
>> Daniel,
>>
>> Can you please post the full error message and share a reproducer for
>> this issue?
>>
>> Cheers,
>>
>> Gilles
>>
>> On Fri, Jan 8, 2021 at 10:25 PM Daniel Torres via users
>>  wrote:
>> >
>> > Hi all.
>> >
>> > Actually I'm implementing an algorithm that creates a process grid and 
>> > divides it into row and column communicators as follows:
>> >
>> >  col_comm0col_comm1col_comm2 col_comm3
>> > row_comm0P0   P1   P2P3
>> > row_comm1P4   P5   P6P7
>> > row_comm2P8   P9   P10   P11
>> > row_comm3P12  P13  P14   P15
>> >
>> > Then, every process works on its own column communicator and broadcast 
>> > data on row communicators.
>> > While column operations are being executed, processes not included in the 
>> > current column communicator just wait for results.
>> >
>> > In a moment, a column communicator could be splitted to create a temp 
>> > communicator and allow only the right processes to work on it.
>> >
>> > At the end of a step, a call to MPI_Barrier (on a duplicate of 
>> > MPI_COMM_WORLD) is executed to sync all processes and avoid bad results.
>> >
>> > With a small amount of data (a small matrix) the MPI_Barrier call syncs 
>> > correctly on the communicator that includes all processes and processing 
>> > ends fine.
>> > But when the amount of data (a big matrix) is incremented, operations on 
>> > column communicators take more time to finish and hence waiting time also 
>> > increments for waiting processes.
>> >
>> > After a few time, waiting processes return an error when they have not 
>> > received the broadcast (MPI_Bcast) on row communicators or when they have 
>> > finished their work at the sync point (MPI_Barrier). But when the 
>> > operations on the current column communicator end, the still active 
>> > processes try to broadcast on row communicators and they fail because the 
>> > waiting processes have returned an error. So all processes fail in 
>> > different moment in time.
>> >
>> > So my problem is that waiting processes "believe" that the current 
>> > operations have failed (but they have not finished yet!) and they fail too.
>> >
>> > So I have a question about MPI_Bcast/MPI_Barrier:
>> >
>> > Is there a way to increment the timeout a process can wait for a broadcast 
>> > or barrier to be completed?
>> >
>> > Here is my machine and OpenMPI info:
>> > - OpenMPI version: Open MPI 4.1.0u1a1
>> > - OS: Linux Daniel 5.4.0-52-generic #57-Ubuntu SMP Thu Oct 15 10:57:00 UTC 
>> > 2020 x86_64 x86_64 x86_64 GNU/Linux
>> >
>> > Thanks in advance for reading my description/question.
>> >
>> > Best regards.
>> >
>> > --
>> > Daniel Torres
>> > LIPN - Université Sorbonne Paris Nord
>
> --
> Daniel Torres
> LIPN - Université Sorbonne Paris Nord


Re: [OMPI users] Confusing behaviour of compiler wrappers

2021-01-09 Thread Gilles Gouaillardet via users
Sajid,

I believe this is a Spack issue and Open MPI cannot do anything about it.
(long story short, `module load openmpi-xyz` does not set the
environment for the (spack) external `xpmem` library.)

I updated the spack issue with some potential workarounds you might
want to give a try.

Cheers,

Gilles

On Sat, Jan 9, 2021 at 8:40 AM Sajid Ali via users
 wrote:
>
> Hi OpenMPI-community,
>
> This is a cross post from the following spack issue : 
> https://github.com/spack/spack/issues/20756
>
> In brief, when I install openmpi@4.1.0 with ucx and xpmem fabrics, the 
> behaviour of the compiler wrappers (mpicc) seems to depend upon the method by 
> which it is loaded into the user environment. When loaded by `spack load`, 
> the compiler wrappers successfully compile a test program. However, if the 
> same compiler wrappers are loaded via `module load` or as part of a spack 
> environment, they fail. What could possibly cause this inconsistency ?
>
> The build logs and the output of opmi_info are available here 
> (https://we.tl/t-CaiOt7OefS) should it be of any help.
>
> Thank You,
> Sajid Ali (he/him) | PhD Candidate
> Applied Physics
> Northwestern University
> s-sajid-ali.github.io


Re: [OMPI users] Timeout in MPI_Bcast/MPI_Barrier?

2021-01-08 Thread Gilles Gouaillardet via users
Daniel,

Can you please post the full error message and share a reproducer for
this issue?

Cheers,

Gilles

On Fri, Jan 8, 2021 at 10:25 PM Daniel Torres via users
 wrote:
>
> Hi all.
>
> Actually I'm implementing an algorithm that creates a process grid and 
> divides it into row and column communicators as follows:
>
>  col_comm0col_comm1col_comm2 col_comm3
> row_comm0P0   P1   P2P3
> row_comm1P4   P5   P6P7
> row_comm2P8   P9   P10   P11
> row_comm3P12  P13  P14   P15
>
> Then, every process works on its own column communicator and broadcast data 
> on row communicators.
> While column operations are being executed, processes not included in the 
> current column communicator just wait for results.
>
> In a moment, a column communicator could be splitted to create a temp 
> communicator and allow only the right processes to work on it.
>
> At the end of a step, a call to MPI_Barrier (on a duplicate of 
> MPI_COMM_WORLD) is executed to sync all processes and avoid bad results.
>
> With a small amount of data (a small matrix) the MPI_Barrier call syncs 
> correctly on the communicator that includes all processes and processing ends 
> fine.
> But when the amount of data (a big matrix) is incremented, operations on 
> column communicators take more time to finish and hence waiting time also 
> increments for waiting processes.
>
> After a few time, waiting processes return an error when they have not 
> received the broadcast (MPI_Bcast) on row communicators or when they have 
> finished their work at the sync point (MPI_Barrier). But when the operations 
> on the current column communicator end, the still active processes try to 
> broadcast on row communicators and they fail because the waiting processes 
> have returned an error. So all processes fail in different moment in time.
>
> So my problem is that waiting processes "believe" that the current operations 
> have failed (but they have not finished yet!) and they fail too.
>
> So I have a question about MPI_Bcast/MPI_Barrier:
>
> Is there a way to increment the timeout a process can wait for a broadcast or 
> barrier to be completed?
>
> Here is my machine and OpenMPI info:
> - OpenMPI version: Open MPI 4.1.0u1a1
> - OS: Linux Daniel 5.4.0-52-generic #57-Ubuntu SMP Thu Oct 15 10:57:00 UTC 
> 2020 x86_64 x86_64 x86_64 GNU/Linux
>
> Thanks in advance for reading my description/question.
>
> Best regards.
>
> --
> Daniel Torres
> LIPN - Université Sorbonne Paris Nord


Re: [OMPI users] MPMD hostfile: executables on same hosts

2020-12-21 Thread Gilles Gouaillardet via users
Vineet,

probably *not* what you expect, but I guess you can try

$ cat host-file
host1 slots=3
host2 slots=3
host3 slots=3

$ mpirun -hostfile host-file -np 2 ./EXE1 : -np 1 ./EXE2 : -np 2 ./EXE1 : -np 1 ./EXE2 : -np 2 ./EXE1 : -np 1 ./EXE2


Cheers,

Gilles

On Mon, Dec 21, 2020 at 10:26 PM Vineet Soni via users
 wrote:
>
> Hello,
>
> I'm having trouble using the MPMD hostfile in which I want to place 2 
> executables on the same nodes.
>
> For example, I can do this using Intel MPI by:
> $ mpirun -machine host-file -n 6 ./EXE1 : -n 3 ./EXE2
> $ cat host-file
> host1:2
> host2:2
> host3:2
> host1:1
> host2:1
> host3:1
>
> This would place 2 MPI processes of EXE1 and 1 MPI process of EXE2 on host1.
>
> However, I get an error if I define the same hostname twice in the hostfile 
> of OpenMPI:
> $ mpirun -hostfile host-file -np 6 ./EXE1 : -np 3 ./EXE2
> $ cat host-file
> host1 slots=2 max_slots=3
> host2 slots=2 max_slots=3
> host3 slots=2 max_slots=3
> host1 slots=1 max_slots=3
> host2 slots=1 max_slots=3
> host3 slots=1 max_slots=3
>
> Is there a way to place both executables on the same hosts using a hostfile?
>
> Thanks in advance.
>
> Best,
> Vineet


Re: [OMPI users] MPI_type_free question

2020-12-14 Thread Gilles Gouaillardet via users
Hi Patrick,

Glad to hear you are now able to move forward.

Please keep in mind this is not a fix but a temporary workaround.
At first glance, I did not spot any issue in the current code.
It turned out that the memory leak disappeared when doing things differently

Cheers,

Gilles

On Mon, Dec 14, 2020 at 7:11 PM Patrick Bégou via users <
users@lists.open-mpi.org> wrote:

> Hi Gilles,
>
> you catch the bug! With this patch, on a single node, the memory leak
> disappear. The cluster is actualy overloaded, as soon as possible I will
> launch a multinode test.
> Below the memory used by rank 0 before (blue) and after (red) the patch.
>
> Thanks
>
> Patrick
>
>
> Le 10/12/2020 à 10:15, Gilles Gouaillardet via users a écrit :
>
> Patrick,
>
>
> First, thank you very much for sharing the reproducer.
>
>
> Yes, please open a github issue so we can track this.
>
>
> I cannot fully understand where the leak is coming from, but so far
>
>  - the code fails on master built with --enable-debug (the data engine
> reports an error) but not with the v3.1.x branch
>
>   (this suggests there could be an error in the latest Open MPI ... or in
> the code)
>
>  - the attached patch seems to have a positive effect, can you please give
> it a try?
>
>
> Cheers,
>
>
> Gilles
>
>
>
> On 12/7/2020 6:15 PM, Patrick Bégou via users wrote:
>
> Hi,
>
> I've written a small piece of code to show the problem. Based on my
> application but 2D and using integers arrays for testing.
> The  figure below shows the max RSS size of rank 0 process on 2
> iterations on 8 and 16 cores, with openib and tcp drivers.
> The more processes I have, the larger the memory leak.  I use the same
> binaries for the 4 runs and OpenMPI 3.1 (same behavior with 4.0.5).
> The code is in attachment. I'll try to check type deallocation as soon as
> possible.
>
> Patrick
>
>
>
>
> Le 04/12/2020 à 01:34, Gilles Gouaillardet via users a écrit :
>
> Patrick,
>
>
> based on George's idea, a simpler check is to retrieve the Fortran index
> via the (standard) MPI_Type_c2f() function
>
> after you create a derived datatype.
>
>
> If the index keeps growing forever even after you MPI_Type_free(), then
> this clearly indicates a leak.
>
> Unfortunately, this simple test cannot be used to definitely rule out any
> memory leak.
>
>
> Note you can also
>
> mpirun --mca pml ob1 --mca btl tcp,self ...
>
> in order to force communications over TCP/IP and hence rule out any memory
> leak that could be triggered by your fast interconnect.
>
>
>
> In any case, a reproducer will greatly help us debugging this issue.
>
>
> Cheers,
>
>
> Gilles
>
>
>
> On 12/4/2020 7:20 AM, George Bosilca via users wrote:
>
> Patrick,
>
> I'm afraid there is no simple way to check this. The main reason being
> that OMPI use handles for MPI objects, and these handles are not tracked by
> the library, they are supposed to be provided by the user for each call. In
> your case, as you already called MPI_Type_free on the datatype, you cannot
> produce a valid handle.
>
> There might be a trick. If the datatype is manipulated with any Fortran
> MPI functions, then we convert the handle (which in fact is a pointer) to
> an index into a pointer array structure. Thus, the index will remain used,
> and can therefore be used to convert back into a valid datatype pointer,
> until OMPI completely releases the datatype. Look into
> the ompi_datatype_f_to_c_table table to see the datatypes that exist and
> get their pointers, and then use these pointers as arguments to
> ompi_datatype_dump() to see if any of these existing datatypes are the ones
> you define.
>
> George.
>
>
>
>
> On Thu, Dec 3, 2020 at 4:44 PM Patrick Bégou via users <
> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
> > wrote:
>
> Hi,
>
> I'm trying to solve a memory leak since my new implementation of
> communications based on MPI_AllToAllW and MPI_type_Create_SubArray
> calls.  Arrays of SubArray types are created/destroyed at each
> time step and used for communications.
>
> On my laptop the code runs fine (running for 15000 temporal
> itérations on 32 processes with oversubscription) but on our
> cluster memory used by the code increase until the OOMkiller stop
> the job. On the cluster we use IB QDR for communications.
>
> Same Gcc/Gfortran 7.3 (built from sources), same sources of
> OpenMPI (3.1 or 4.0.5 tested), same sources of the fortran code on
> the laptop and on the cluster.
>
> Using Gcc/Gfortran 4.8 and OpenMPI 1.7.3 on the cluster do not
> show the problem (resident memory do not increase and we ran
> 10 temporal iterations)
>
> MPI_type_free manual says that it "/Marks the datatype object
> associated with datatype for deallocation/". But  how can I check
> that the deallocation is really done ?
>
> Thanks for ant suggestions.
>
> Patrick
>
>
>
>


Re: [OMPI users] MPI_type_free question

2020-12-10 Thread Gilles Gouaillardet via users

Patrick,


First, thank you very much for sharing the reproducer.


Yes, please open a github issue so we can track this.


I cannot fully understand where the leak is coming from, but so far

 - the code fails on master built with --enable-debug (the data engine 
reports an error) but not with the v3.1.x branch


  (this suggests there could be an error in the latest Open MPI ... or 
in the code)


 - the attached patch seems to have a positive effect, can you please 
give it a try?



Cheers,


Gilles



On 12/7/2020 6:15 PM, Patrick Bégou via users wrote:

Hi,

I've written a small piece of code to show the problem. Based on my 
application but 2D and using integers arrays for testing.
The  figure below shows the max RSS size of rank 0 process on 2 
iterations on 8 and 16 cores, with openib and tcp drivers.
The more processes I have, the larger the memory leak.  I use the same 
binaries for the 4 runs and OpenMPI 3.1 (same behavior with 4.0.5).
The code is in attachment. I'll try to check type deallocation as soon 
as possible.


Patrick




Le 04/12/2020 à 01:34, Gilles Gouaillardet via users a écrit :

Patrick,


based on George's idea, a simpler check is to retrieve the Fortran 
index via the (standard) MPI_Type_c2f() function


after you create a derived datatype.


If the index keeps growing forever even after you MPI_Type_free(), 
then this clearly indicates a leak.


Unfortunately, this simple test cannot be used to definitely rule out 
any memory leak.



Note you can also

mpirun --mca pml ob1 --mca btl tcp,self ...

in order to force communications over TCP/IP and hence rule out any 
memory leak that could be triggered by your fast interconnect.




In any case, a reproducer will greatly help us debugging this issue.


Cheers,


Gilles



On 12/4/2020 7:20 AM, George Bosilca via users wrote:

Patrick,

I'm afraid there is no simple way to check this. The main reason 
being that OMPI use handles for MPI objects, and these handles are 
not tracked by the library, they are supposed to be provided by the 
user for each call. In your case, as you already called 
MPI_Type_free on the datatype, you cannot produce a valid handle.


There might be a trick. If the datatype is manipulated with any 
Fortran MPI functions, then we convert the handle (which in fact is 
a pointer) to an index into a pointer array structure. Thus, the 
index will remain used, and can therefore be used to convert back 
into a valid datatype pointer, until OMPI completely releases the 
datatype. Look into the ompi_datatype_f_to_c_table table to see the 
datatypes that exist and get their pointers, and then use these 
pointers as arguments to ompi_datatype_dump() to see if any of these 
existing datatypes are the ones you define.


George.




On Thu, Dec 3, 2020 at 4:44 PM Patrick Bégou via users 
mailto:users@lists.open-mpi.org>> wrote:


    Hi,

    I'm trying to solve a memory leak since my new implementation of
    communications based on MPI_AllToAllW and MPI_type_Create_SubArray
    calls.  Arrays of SubArray types are created/destroyed at each
    time step and used for communications.

    On my laptop the code runs fine (running for 15000 temporal
    itérations on 32 processes with oversubscription) but on our
    cluster memory used by the code increase until the OOMkiller stop
    the job. On the cluster we use IB QDR for communications.

    Same Gcc/Gfortran 7.3 (built from sources), same sources of
    OpenMPI (3.1 or 4.0.5 tested), same sources of the fortran code on
    the laptop and on the cluster.

    Using Gcc/Gfortran 4.8 and OpenMPI 1.7.3 on the cluster do not
    show the problem (resident memory do not increase and we ran
    10 temporal iterations)

    MPI_type_free manual says that it "/Marks the datatype object
    associated with datatype for deallocation/". But  how can I check
    that the deallocation is really done ?

    Thanks for ant suggestions.

    Patrick



diff --git a/ompi/mca/coll/basic/coll_basic_alltoallw.c 
b/ompi/mca/coll/basic/coll_basic_alltoallw.c
index 93fa880..5aca2c2 100644
--- a/ompi/mca/coll/basic/coll_basic_alltoallw.c
+++ b/ompi/mca/coll/basic/coll_basic_alltoallw.c
@@ -194,7 +194,7 @@ mca_coll_basic_alltoallw_intra(const void *sbuf, const int 
*scounts, const int *
 continue;
 
 prcv = ((char *) rbuf) + rdisps[i];
-err = MCA_PML_CALL(irecv_init(prcv, rcounts[i], rdtypes[i],
+err = MCA_PML_CALL(irecv(prcv, rcounts[i], rdtypes[i],
   i, MCA_COLL_BASE_TAG_ALLTOALLW, comm,
   preq++));
 ++nreqs;
@@ -215,21 +215,15 @@ mca_coll_basic_alltoallw_intra(const void *sbuf, const 
int *scounts, const int *
 continue;
 
 psnd = ((char *) sbuf) + sdisps[i];
-err = MCA_PML_CALL(isend_init(psnd, scounts[i], sdtypes[i],
+err = MCA_PML_CALL(send(psnd, scounts[i], sdtypes[i],

Re: [OMPI users] MPI_type_free question

2020-12-04 Thread Gilles Gouaillardet via users

Patrick,


the test points to a leak in the way the interconnect component (pml/ucx 
? pml/cm? mtl/psm2? btl/openib?) handles the datatype rather than the 
datatype engine itself.



What interconnect is available on your cluster and which component(s) 
are used?



mpirun --mca pml_base_verbose 10 --mca mtl_base_verbose 10 --mca 
btl_base_verbose 10 ...


will point you to the component(s) used.

The output is pretty verbose, so feel free to compress and post it if 
you cannot decipher it



Cheers,


Gilles

On 12/4/2020 4:32 PM, Patrick Bégou via users wrote:

Hi George and Gilles,

Thanks George for your suggestion. Is it valuable for 4.05 and 3.1 
OpenMPI Versions ? I will have a look today at these tables. May be 
writing a small piece of code juste creating and freeing subarray 
datatype.


Thanks Gilles for suggesting disabling the interconnect. it is a good 
fast test and yes, *with "mpirun --mca pml ob1 --mca btl tcp,self" I 
have no memory leak*. So this explain the differences between my 
laptop and the cluster.

The implementation of type management is so different from 1.7.3  ?

A PhD student tells me he has also some trouble with this code on a 
cluster Omnipath based. I will have to investigate too but not sure it 
is the same problem.


Patrick

Le 04/12/2020 à 01:34, Gilles Gouaillardet via users a écrit :

Patrick,


based on George's idea, a simpler check is to retrieve the Fortran 
index via the (standard) MPI_Type_c2f() function


after you create a derived datatype.


If the index keeps growing forever even after you MPI_Type_free(), 
then this clearly indicates a leak.


Unfortunately, this simple test cannot be used to definitely rule out 
any memory leak.



Note you can also

mpirun --mca pml ob1 --mca btl tcp,self ...

in order to force communications over TCP/IP and hence rule out any 
memory leak that could be triggered by your fast interconnect.




In any case, a reproducer will greatly help us debugging this issue.


Cheers,


Gilles



On 12/4/2020 7:20 AM, George Bosilca via users wrote:

Patrick,

I'm afraid there is no simple way to check this. The main reason 
being that OMPI use handles for MPI objects, and these handles are 
not tracked by the library, they are supposed to be provided by the 
user for each call. In your case, as you already called 
MPI_Type_free on the datatype, you cannot produce a valid handle.


There might be a trick. If the datatype is manipulated with any 
Fortran MPI functions, then we convert the handle (which in fact is 
a pointer) to an index into a pointer array structure. Thus, the 
index will remain used, and can therefore be used to convert back 
into a valid datatype pointer, until OMPI completely releases the 
datatype. Look into the ompi_datatype_f_to_c_table table to see the 
datatypes that exist and get their pointers, and then use these 
pointers as arguments to ompi_datatype_dump() to see if any of these 
existing datatypes are the ones you define.


George.




On Thu, Dec 3, 2020 at 4:44 PM Patrick Bégou via users 
mailto:users@lists.open-mpi.org>> wrote:


    Hi,

    I'm trying to solve a memory leak since my new implementation of
    communications based on MPI_AllToAllW and MPI_type_Create_SubArray
    calls.  Arrays of SubArray types are created/destroyed at each
    time step and used for communications.

    On my laptop the code runs fine (running for 15000 temporal
    itérations on 32 processes with oversubscription) but on our
    cluster memory used by the code increase until the OOMkiller stop
    the job. On the cluster we use IB QDR for communications.

    Same Gcc/Gfortran 7.3 (built from sources), same sources of
    OpenMPI (3.1 or 4.0.5 tested), same sources of the fortran code on
    the laptop and on the cluster.

    Using Gcc/Gfortran 4.8 and OpenMPI 1.7.3 on the cluster do not
    show the problem (resident memory do not increase and we ran
    10 temporal iterations)

    MPI_type_free manual says that it "/Marks the datatype object
    associated with datatype for deallocation/". But  how can I check
    that the deallocation is really done ?

    Thanks for ant suggestions.

    Patrick





Re: [OMPI users] MPI_type_free question

2020-12-03 Thread Gilles Gouaillardet via users

Patrick,


based on George's idea, a simpler check is to retrieve the Fortran index 
via the (standard) MPI_Type_c2f() function


after you create a derived datatype.


If the index keeps growing forever even after you MPI_Type_free(), then 
this clearly indicates a leak.
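
A minimal sketch of such a check in C (assuming a subarray type similar to yours; MPI_Type_c2f() works on any derived datatype, and from Fortran the handle itself already is the index):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    for (int i = 0; i < 5; i++) {
        MPI_Datatype t;
        int sizes[2] = {8, 8}, subsizes[2] = {4, 4}, starts[2] = {0, 0};
        MPI_Type_create_subarray(2, sizes, subsizes, starts,
                                 MPI_ORDER_FORTRAN, MPI_INT, &t);
        MPI_Type_commit(&t);
        /* if the datatype is really released by MPI_Type_free() below,
           this index should not keep growing across iterations */
        printf("iteration %d: Fortran index = %d\n", i, (int)MPI_Type_c2f(t));
        MPI_Type_free(&t);
    }
    MPI_Finalize();
    return 0;
}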


Unfortunately, this simple test cannot be used to definitely rule out 
any memory leak.



Note you can also

mpirun --mca pml ob1 --mca btl tcp,self ...

in order to force communications over TCP/IP and hence rule out any 
memory leak that could be triggered by your fast interconnect.




In any case, a reproducer will greatly help us debugging this issue.


Cheers,


Gilles



On 12/4/2020 7:20 AM, George Bosilca via users wrote:

Patrick,

I'm afraid there is no simple way to check this. The main reason being 
that OMPI use handles for MPI objects, and these handles are not 
tracked by the library, they are supposed to be provided by the user 
for each call. In your case, as you already called MPI_Type_free on 
the datatype, you cannot produce a valid handle.


There might be a trick. If the datatype is manipulated with any 
Fortran MPI functions, then we convert the handle (which in fact is a 
pointer) to an index into a pointer array structure. Thus, the index 
will remain used, and can therefore be used to convert back into a 
valid datatype pointer, until OMPI completely releases the datatype. 
Look into the ompi_datatype_f_to_c_table table to see the datatypes 
that exist and get their pointers, and then use these pointers as 
arguments to ompi_datatype_dump() to see if any of these existing 
datatypes are the ones you define.


George.




On Thu, Dec 3, 2020 at 4:44 PM Patrick Bégou via users 
mailto:users@lists.open-mpi.org>> wrote:


Hi,

I'm trying to solve a memory leak since my new implementation of
communications based on MPI_AllToAllW and MPI_type_Create_SubArray
calls.  Arrays of SubArray types are created/destroyed at each
time step and used for communications.

On my laptop the code runs fine (running for 15000 temporal
itérations on 32 processes with oversubscription) but on our
cluster memory used by the code increase until the OOMkiller stop
the job. On the cluster we use IB QDR for communications.

Same Gcc/Gfortran 7.3 (built from sources), same sources of
OpenMPI (3.1 or 4.0.5 tested), same sources of the fortran code on
the laptop and on the cluster.

Using Gcc/Gfortran 4.8 and OpenMPI 1.7.3 on the cluster do not
show the problem (resident memory do not increase and we ran
10 temporal iterations)

MPI_type_free manual says that it "/Marks the datatype object
associated with datatype for deallocation/". But  how can I check
that the deallocation is really done ?

Thanks for ant suggestions.

Patrick



Re: [OMPI users] Parallel HDF5 low performance

2020-12-03 Thread Gilles Gouaillardet via users
Patrick,

glad to hear you will upgrade Open MPI thanks to this workaround!

ompio has known performance issues on Lustre (this is why ROMIO is
still the default on this filesystem)
but I do not remember such performance issues being reported on an
NFS filesystem.

Sharing a reproducer will be very much appreciated in order to improve ompio

Cheers,

Gilles

On Thu, Dec 3, 2020 at 6:05 PM Patrick Bégou via users
 wrote:
>
> Thanks Gilles,
>
> this is the solution.
> I will set OMPI_MCA_io=^ompio automatically when loading the parallel
> hdf5 module on the cluster.
>
> I was tracking this problem for several weeks but not looking in the
> right direction (testing NFS server I/O, network bandwidth.)
>
> I think we will now move definitively to modern OpenMPI implementations.
>
> Patrick
>
> Le 03/12/2020 à 09:06, Gilles Gouaillardet via users a écrit :
> > Patrick,
> >
> >
> > In recent Open MPI releases, the default component for MPI-IO is ompio
> > (and no more romio)
> >
> > unless the file is on a Lustre filesystem.
> >
> >
> > You can force romio with
> >
> > mpirun --mca io ^ompio ...
> >
> >
> > Cheers,
> >
> >
> > Gilles
> >
> > On 12/3/2020 4:20 PM, Patrick Bégou via users wrote:
> >> Hi,
> >>
> >> I'm using an old (but required by the codes) version of hdf5 (1.8.12) in
> >> parallel mode in 2 fortran applications. It relies on MPI/IO. The
> >> storage is NFS mounted on the nodes of a small cluster.
> >>
> >> With OpenMPI 1.7 it runs fine but using modern OpenMPI 3.1 or 4.0.5 the
> >> I/Os are 10x to 100x slower. Are there fundamentals changes in MPI/IO
> >> for these new releases of OpenMPI and a solution to get back to the IO
> >> performances with this parallel HDF5 release ?
> >>
> >> Thanks for your advices
> >>
> >> Patrick
> >>
>


Re: [OMPI users] Parallel HDF5 low performance

2020-12-03 Thread Gilles Gouaillardet via users

Patrick,


In recent Open MPI releases, the default component for MPI-IO is ompio 
(and no more romio)


unless the file is on a Lustre filesystem.


You can force romio with

mpirun --mca io ^ompio ...


Cheers,


Gilles

On 12/3/2020 4:20 PM, Patrick Bégou via users wrote:

Hi,

I'm using an old (but required by the codes) version of hdf5 (1.8.12) in
parallel mode in 2 fortran applications. It relies on MPI/IO. The
storage is NFS mounted on the nodes of a small cluster.

With OpenMPI 1.7 it runs fine but using modern OpenMPI 3.1 or 4.0.5 the
I/Os are 10x to 100x slower. Are there fundamentals changes in MPI/IO
for these new releases of OpenMPI and a solution to get back to the IO
performances with this parallel HDF5 release ?

Thanks for your advices

Patrick



Re: [OMPI users] Unable to run complicated MPI Program

2020-11-28 Thread Gilles Gouaillardet via users
Dean,

That typically occurs when some nodes have multiple interfaces, and
several nodes have a similar IP on a private/unused interface.

I suggest you explicitly restrict the interface Open MPI should be using.
For example, you can

mpirun --mca btl_tcp_if_include eth0 ...
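
If the out-of-band (OOB) layer also picks the wrong interface, the same restriction can be applied there as well (a sketch, assuming eth0 is the interface shared by all nodes):

mpirun --mca btl_tcp_if_include eth0 --mca oob_tcp_if_include eth0 ...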

Cheers,

Gilles

On Fri, Nov 27, 2020 at 7:36 PM CHESTER, DEAN (PGR) via users
 wrote:
>
> Hi,
>
> I am trying to set up some machines with OpenMPI connected with ethernet to 
> expand some batch system we already have in use.
>
> This is controlled with Slurm already and we are able to get a basic MPI 
> program running across 2 of the machines but when I compile and something 
> that actually performs communication it fails.
>
> Slurm was not configured with PMI/PMI2 so we require running with mpirun for 
> program execution.
>
> OpenMPI is installed on my home space which is accessible on all of the nodes 
> we are trying to run on.
>
> My hello world application gets the world size, rank and hostname and prints 
> this. This successfully launches and runs.
>
> Hello world from processor viper-03, rank 0 out of 8 processors
> Hello world from processor viper-03, rank 1 out of 8 processors
> Hello world from processor viper-03, rank 2 out of 8 processors
> Hello world from processor viper-03, rank 3 out of 8 processors
> Hello world from processor viper-04, rank 4 out of 8 processors
> Hello world from processor viper-04, rank 5 out of 8 processors
> Hello world from processor viper-04, rank 6 out of 8 processors
> Hello world from processor viper-04, rank 7 out of 8 processors
>
> I then tried to run the OSU micro-benchmarks but these fail to run. I get the 
> following output:
>
> # OSU MPI Latency Test v5.6.3
> # Size  Latency (us)
> [viper-01:25885] [[21336,0],0] ORTE_ERROR_LOG: Data unpack would read past 
> end of buffer in file util/show_help.c at line 507
> --
> WARNING: Open MPI accepted a TCP connection from what appears to be a
> another Open MPI process but cannot find a corresponding process
> entry for that peer.
>
> This attempted connection will be ignored; your MPI job may or may not
> continue properly.
>
>   Local host: viper-02
>   PID:20406
> —
>
> The machines are firewalled, yet the ports 9000-9060 are open. I have set the 
> following MCA parameters to match the open ports:
>
> btl_tcp_port_min_v4=9000
> btl_tcp_port_range_v4=60
> oob_tcp_dynamic_ipv4_ports=9020
>
> OpenMPI 4.0.5 was built with GCC 4.8.5 and only the installation prefix was 
> set to $HOME/local/ompi.
>
> What else could be going wrong?
>
> Kind Regards,
>
> Dean


Re: [OMPI users] 4.0.5 on Linux Pop!_OS

2020-11-07 Thread Gilles Gouaillardet via users
Paul,

a "slot" is explicitly defined in the error message you copy/pasted:

"If none of a hostfile, the --host command line parameter, or an RM is
present, Open MPI defaults to the number of processor cores"

The error message also lists 4 ways on how you can move forward, but
you should first ask yourself if you really want to run 12 MPI tasks
on your machine.

Cheers,

Gilles

On Sun, Nov 8, 2020 at 11:14 AM Paul Cizmas via users
 wrote:
>
> Hello:
>
> I just installed OpenMPI 4.0.5 on a Linux machine running Pop!_OS (made by 
> System76).  The workstation has the following architecture:
>
> Architecture:x86_64
> CPU op-mode(s):  32-bit, 64-bit
> Byte Order:  Little Endian
> Address sizes:   39 bits physical, 48 bits virtual
> CPU(s):  16
> On-line CPU(s) list: 0-15
> Thread(s) per core:  2
> Core(s) per socket:  8
> Socket(s):   1
> NUMA node(s):1
> Vendor ID:   GenuineIntel
> CPU family:  6
>
> I am trying to run on the Linux box a code that I usually run on a Mac OS 
> without any issues.
>
> The script that I use is:
>
> exe='/usr/bin/mycode' # on jp2
> mympirun='/opt/openmpi/4.0.5/bin/mpirun'   # GFortran on jp2
> $mympirun -np 12  $exe input1
>
> I get the following error:
> 
> No protocol specified
> --
> There are not enough slots available in the system to satisfy the 12
> slots that were requested by the application:
>
>  /usr/bin/mycode
>
> Either request fewer slots for your application, or make more slots
> available for use.
>
> A "slot" is the Open MPI term for an allocatable unit where we can
> launch a process.  The number of slots available are defined by the
> environment in which Open MPI processes are run:
>
>  1. Hostfile, via "slots=N" clauses (N defaults to number of
> processor cores if not provided)
>  2. The --host command line parameter, via a ":N" suffix on the
> hostname (N defaults to 1 if not provided)
>  3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
>  4. If none of a hostfile, the --host command line parameter, or an
> RM is present, Open MPI defaults to the number of processor cores
>
> In all the above cases, if you want Open MPI to default to the number
> of hardware threads instead of the number of processor cores, use the
> --use-hwthread-cpus option.
>
> Alternatively, you can use the --oversubscribe option to ignore the
> number of available slots when deciding the number of processes to
> launch.
> ===
>
> I do not understand “slots”.  The architecture description of my Linux box 
> lists sockets, cores and threads, but not slots.
>
> What shall I specify instead of "-np 12”?
>
> Thank you,
> Paul


Re: [OMPI users] ompe support for filesystems

2020-10-31 Thread Gilles Gouaillardet via users
Hi Ognen,

MPI-IO is implemented by two components:
 - ROMIO (from MPICH)
 - ompio ("native" Open MPI MPI-IO, default component unless running on Lustre)

Assuming you want to add support for a new filesystem in ompio, the first
step is to implement a new component in the fs framework.
The framework is in ompi/mca/fs, and each component is in its own
directory (for example ompi/mca/fs/gpfs).

There are some configury tricks (create a configure.m4, add a Makefile.am,
hook them into the autotools build, ...) to make sure your component is even compiled.
If you are struggling with these, feel free to open a Pull Request to
get some help fixing the missing bits.
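
As a rough sketch only (the component name "myfs" is hypothetical, and the
file names follow the pattern of the existing fs/ufs and fs/gpfs components):

ompi/mca/fs/myfs/
    configure.m4            # detect your filesystem's headers/libraries
    Makefile.am
    fs_myfs.h
    fs_myfs_component.c     # registers the component with the fs framework
    fs_myfs_file_open.c     # open/close/delete/sync hooks called by ompio
    ...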

Cheers,

Gilles

On Sun, Nov 1, 2020 at 12:18 PM Ognen Duzlevski via users
 wrote:
>
> Hello!
>
> If I wanted to support a specific filesystem in open mpi, how is this
> done? What code in the source tree does it?
>
> Thanks!
> Ognen


Re: [OMPI users] Anyone try building openmpi 4.0.5 w/ llvm 11

2020-10-22 Thread Gilles Gouaillardet via users
Alan,

thanks for the report, I addressed this issue in
https://github.com/open-mpi/ompi/pull/8116

As a temporary workaround, you can apply the attached patch.

FWIW, f18 (shipped with LLVM 11.0.0) is still in development and uses
gfortran under the hood.

Cheers,

Gilles

On Wed, Oct 21, 2020 at 12:44 AM Alan Wild via users
 wrote:
>
> More specifically building the new “flang” compiler and compiling openmpi 
> with the combination of clang/flang rather than clang/gfortran.
>
> Configure is passing (including support for 16 byte REAL and COMPLEX types).
> However there is one file that uses REAL128 and COMPLEX(REAL128) and I'm 
> getting unsupported type / KIND=-1 errors.
>
> (Quick aside I’m surprised that the one file is using the ISO_FORTRAN_ENV 
> type names but configure is only checking for the non-standard REAL*# names.  
> Feels like an oversight to me.)
>
> From what I’ve been able to find (and reading their iso_fortran_env.f90 
> module file) the compiler should support the two types. I barely know enough 
> FORTRAN to be dangerous (took one semester of F77 in like 1996) so I’m not 
> really sure what I’m looking at here or what to try next.
>
> I would really like to provide an openmpi build to my users that is a “pure 
> LLVM” build.
>
> -Alan
> --
> a...@madllama.net http://humbleville.blogspot.com


configure.diff
Description: Binary data


Re: [OMPI users] mpirun on Kubuntu 20.4.1 hangs

2020-10-21 Thread Gilles Gouaillardet via users
Hi Jorge,

If a firewall is running on your nodes, I suggest you disable it and try again
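
On Kubuntu that usually means ufw; assuming ufw is what is running, you can
check and temporarily disable it on every node with:

sudo ufw status
sudo ufw disable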

Cheers,

Gilles

On Wed, Oct 21, 2020 at 5:50 AM Jorge SILVA via users
 wrote:
>
> Hello,
>
> I installed kubuntu20.4.1 with openmpi 4.0.3-0ubuntu in two different
> computers in the standard way. Compiling with mpif90 works, but mpirun
> hangs with no output in both systems. Even mpirun command without
> parameters hangs and only twice ctrl-C typing can end the sleeping
> program. Only  the command
>
>  mpirun --help
>
> gives the usual output.
>
> Seems that is something related to the terminal output, but the command
> worked well for Kubuntu 18.04. Is there a way to debug or fix this
> problem (without re-compiling from sources, etc)? Is it a known problem?
>
> Thanks,
>
>   Jorge
>


Re: [OMPI users] Issue with shared memory arrays in Fortran

2020-08-24 Thread Gilles Gouaillardet via users
Patrick,

Thanks for the report and the reproducer.

I was able to confirm the issue with python and Fortran, but
 - I can only reproduce it with pml/ucx (read: --mca pml ob1 --mca btl
tcp,self works fine)
 - I can only reproduce it with bcast algorithm 8 and 9

As a workaround, you can keep using ucx but manually change the bcast algo

mpirun --mca coll_tuned_use_dynamic_rules 1 --mca
coll_tuned_bcast_algorithm 1 ...

/* you can replace the bcast algorithm with any value between 1 and 7
included */
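
If you prefer not to touch every mpirun command line, the same workaround can
be set via environment variables (equivalent to the options above):

export OMPI_MCA_coll_tuned_use_dynamic_rules=1
export OMPI_MCA_coll_tuned_bcast_algorithm=1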

Cheers,

Gilles

On Mon, Aug 24, 2020 at 10:58 PM Patrick McNally via users
 wrote:
>
> I apologize in advance for the size of the example source and probably the 
> length of the email, but this has been a pain to track down.
>
> Our application uses System V style shared memory pretty extensively and we have 
> recently found that in certain circumstances, OpenMPI appears to provide 
> ranks with stale data.  The attached archive contains sample code that 
> demonstrates the issue.  There is a subroutine that uses a shared memory 
> array to broadcast from a single rank on one compute node to a single rank on 
> all other compute nodes.  The first call sends all 1s, then all 2s, and so 
> on.  The receiving rank(s) get all 1s on the first execution, but on 
> subsequent executions they receive some 2s and some 1s; then some 3s, some 
> 2s, and some 1s.  The code contains a version of this routine in both C and 
> Fortran but only the Fortran version appears to exhibit the problem.
>
> I've tried this with OpenMPI 3.1.5, 4.0.2, and 4.0.4 and on two different 
> systems with very different configurations and both show the problem.  On one 
> of the machines, it only appears to happen when MPI is initialized with 
> mpi4py, so I've included that in the test as well.  Other than that, the 
> behavior is very consistent across machines.  When run with the same number 
> of ranks and same size array, the two machines even show the invalid values 
> at the same indices.
>
> Please let me know if you need any additional information.
>
> Thanks,
> Patrick


Re: [OMPI users] ORTE HNP Daemon Error - Generated by Tweaking MTU

2020-08-09 Thread Gilles Gouaillardet via users
John,

I am not sure you will get much help here with a kernel crash caused
by a tweaked driver.

About HPL, you are more likely to get better performance with P and Q
closer (e.g. 4x8 is likely better than 2x16 or 1x32).
Also, HPL might have better performance with one MPI task per node and
a multithreaded BLAS
(e.g. PxQ = 2x4 and 4 OpenMP threads per MPI task)
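
A minimal sketch of such a hybrid run on your 8 nodes (assuming HPL.dat set to
P=2 and Q=4, a BLAS built with OpenMP support, and ./xhpl as an illustrative
path):

mpirun -np 8 --map-by ppr:1:node --bind-to none -x OMP_NUM_THREADS=4 ./xhpl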

Cheers,

Gilles


On Mon, Aug 10, 2020 at 3:31 AM John Duffy via users
 wrote:
>
> Hi
>
> I have generated this problem myself by tweaking the MTU of my 8 node 
> Raspberry Pi 4 cluster to 9000 bytes, but I would be grateful for any 
> ideas/suggestions on how to relate the Open-MPI ORTE message to my tweaking.
>
> When I run HPL Linpack using my “improved” cluster, it runs quite happily for 
> 2 hours with P=1 & Q=32 using 80% of memory, and this gives me a 7% 
> performance increase to 97 Gflops. And I can quite happily Iperf 1GB of data 
> between nodes with an improved bandwidth of 980Mb/s. So, the MTU tweak 
> appears to be relatively robust.
>
> However, as soon as the HPL.dat parameters change to P=2 & Q=16, from within 
> the same HPL.dat file, I get the following message...
>
> --
> ORTE has lost communication with a remote daemon.
>
>   HNP daemon   : [[19859,0],0] on node node1
>   Remote daemon: [[19859,0],5] on node node6
>
> This is usually due to either a failure of the TCP network
> connection to the node, or possibly an internal failure of
> the daemon itself. We cannot recover from this failure, and
> therefore will terminate the job.
> —
>
> …and the affected node becomes uncontactable.
>
> I’m thinking the Open-MPI message sizes with P=2 & Q=16 are not working with 
> my imperfect MTU tweak, and I’m corrupting the TCP stack somehow.
>
> My tweak consisted of the following kernel changes:
>
> 1.) include/linux/if_vlan.h
>
> #define VLAN_ETH_DATA_LEN 9000
> #define VLAN_ETH_FRAME_LEN 9018
>
> 2.) include/uapi/linux/if_ether.h
>
> #define ETH_DATA_LEN 9000
> #define ETH_FRAME_LEN 9014
>
> 3.) drivers/net/ethernet/broadcom/genet/bcmgenet.c
>
> #define RX_BUF_LENGTH 10240
>
> The Raspberry Pi 4 ethernet driver does not expose many knobs to turn, most 
> ethtool options are not available, and there is no publicly available NIC 
> documentation, so my tweaks are educated guesswork based upon Raspberry Pi 
> forum threads.
>
> Any ideas/suggestions would be much appreciated. With P=2 & Q=16 prior to my 
> tweak I can achieve 100 Gflops, a potential increase to 107 Gflops is not to 
> be sniffed at.
>
> Best regards
>


Re: [OMPI users] MPI is still dominant paradigm?

2020-08-07 Thread Gilles Gouaillardet via users
The goal of Open MPI is to provide a high quality implementation of the MPI
standard, and the goal of this mailing list is to discuss Open MPI (and not
the MPI standard).

The Java bindings support "recent" JDKs, and if you face an issue, please
report a bug (either here or on github).

Cheers,

Gilles

- Original Message -

Hello,

This may be a bit of a longer post and I am not sure if it is even 
appropriate here, but I figured I'd ask. There are no hidden agendas in it, 
so please treat it as "asking for opinions/advice", as opposed to 
judging or provoking.

For the period from 2010 to 2017 I used to work in (buzzword alert!) 
"big data" (meaning Spark, HDFS, reactive stuff like Akka) but way 
before that in the early 2000s I used to write basic multithreaded C and 
some MPI code. I came back to HPC/academia two years ago and what struck 
me was that (for lack of better word) the field is still "stuck" (again, 
for lack of better word) on MPI. This itself may seem negative in this 
context, however, I am just stating my observation, which may be wrong.

I like low level programming and I like being in control of what is 
going on but having had the experience in Spark and Akka, I kind of got 
spoiled. Yes, I understand that the latter has fault-tolerance (which is 
nice) and MPI doesn't (or at least, didn't when I played with it in 1999-
2005) but I always felt like MPI needed higher level abstractions as a 
CHOICE (not _only_ choice) laid over the bare metal offerings. The whole 
world has moved onto programming in patterns and higher level 
abstractions, why is the academic/HPC world stuck on bare metal, still? 
Yes, I understand that performance often matters and the higher up you 
go, the more performance loss you incur, however, there is also 
something to be said about developer time and ease of understanding/
abstracting etc. etc.

Be that as it may, I am working on a project now in the HPC world and I 
noticed that Open MPI has Java bindings (or should I say "interface"?). 
What is the state of those? Which JDK do they support? Most importantly, 
would it be a HUGE pipe dream to think about building patterns a-la Akka 
(or even mixing actual Akka implementation) on top of OpenMPI via this 
Java bridge? What would be involved on the OpenMPI side? I have time/
interest in going this route if there would be any hope of coming up 
with something that would make my life (and future people coming into 
HPC/MPI) easier in terms of building applications. I am not saying MPI 
in C/C++/Fortran should go away, however, sometimes we don't need the 
low-level stuff to express a concept :-). It may also open a whole new 
world for people on large clusters...

Thank you!
 

