Re: [OMPI users] Seg error when using v5.0.1

2024-01-30 Thread Joseph Schuchart via users
Hello, This looks like memory corruption. Do you have more details on what your app is doing? I don't see any MPI calls inside the call stack. Could you rebuild Open MPI with debug information enabled (by adding `--enable-debug` to configure)? If this error occurs on singleton runs (1

Re: [OMPI users] A make error when build openmpi-5.0.0 using the gcc 14.0.0 (experimental) compiler

2023-12-19 Thread Joseph Schuchart via users
Thanks for the report Jorge! I opened a ticket to track the build issues with GCC-14: https://github.com/open-mpi/ompi/issues/12169 Hopefully we will have Open MPI build with GCC-14 before it is released. Cheers, Joseph On 12/17/23 06:03, Jorge D'Elia via users wrote: Hi there, I already

Re: [OMPI users] MPI_Get is slow with structs containing padding

2023-03-30 Thread Joseph Schuchart via users
Hi Antoine, That's an interesting result. I believe the problem with datatypes with gaps is that MPI is not allowed to touch the gaps. My guess is that for the RMA version of the benchmark the implementation either has to revert back to an active message packing the data at the target and

Re: [OMPI users] Tracing of openmpi internal functions

2022-11-16 Thread Joseph Schuchart via users
Arun, You can use a small wrapper script like this one to store the perf data in separate files:
```
$ cat perfwrap.sh
#!/bin/bash
exec perf record -o perf.data.$OMPI_COMM_WORLD_RANK $@
```
Then do `mpirun -n ./perfwrap.sh ./a.out` to run all processes under perf. You can also select a

Re: [OMPI users] MPI_THREAD_MULTIPLE question

2022-09-10 Thread Joseph Schuchart via users
Timesir, It sounds like you're using the 4.0.x or 4.1.x release. The one-sided components were cleaned up in the upcoming 5.0.x release and the component in question (osc/pt2pt) was removed. You could also try to compile Open MPI 4.0.x/4.1.x against UCX and use osc/ucx (by passing `--mca osc

[OMPI users] 1st Future of MPI RMA Workshop: Call for Short Talks and Participation

2022-05-29 Thread Joseph Schuchart via users
[Apologies if you got multiple copies of this email.] *1st Future of MPI RMA Workshop (FoRMA'22)* https://mpiwg-rma.github.io/forma22/ The MPI RMA Working Group is organizing a workshop aimed at gathering inputs from users and implementors of MPI RMA with past experiences and ideas for

Re: [OMPI users] Check equality of a value in all MPI ranks

2022-02-17 Thread Joseph Schuchart via users
Hi Niranda, A pattern I have seen in several places is to allreduce the pair p = {-x,x} with MPI_MIN or MPI_MAX. If in the resulting pair p[0] == -p[1], then everyone has the same value. If not, at least one rank had a different value. Example: ``` bool is_same(int x) {   int p[2];   p[0] =
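The pattern described above can be sketched without an MPI installation. The following is a minimal plain-C simulation, not the code from the original message: the function name `all_ranks_same` and the array standing in for the per-rank values are assumptions made for illustration. In a real MPI program the loop would presumably be replaced by a single `MPI_Allreduce` with `MPI_IN_PLACE`, a count of 2, `MPI_INT` and `MPI_MIN`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: values[i] stands in for the value x held by rank i.
 * The loop models an element-wise MIN reduction over the pairs {-x, x}
 * contributed by every rank. */
bool all_ranks_same(const int *values, size_t nranks) {
    if (nranks == 0) {
        return true; /* nothing to compare */
    }

    /* Start from the pair contributed by rank 0. */
    int p[2] = { -values[0], values[0] };

    for (size_t i = 1; i < nranks; ++i) {
        int q[2] = { -values[i], values[i] };
        if (q[0] < p[0]) p[0] = q[0]; /* min of -x, i.e. -(max of x) */
        if (q[1] < p[1]) p[1] = q[1]; /* min of x */
    }

    /* p[0] == -p[1] holds exactly when max(x) == min(x). */
    return p[0] == -p[1];
}
```

After the reduction, `p[0]` holds the negated maximum and `p[1]` the minimum, so the comparison succeeds only if all ranks contributed the same value. One caveat of this sketch: negating `INT_MIN` overflows, so that value would need separate handling.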

Re: [OMPI users] Using OSU benchmarks for checking Infiniband network

2022-02-11 Thread Joseph Schuchart via users
Analysis and Profiling Tool is provided: OSU-INAM. Is there something equivalent using openMPI? Best Denis *From:* users on behalf of Joseph Schuchart via users *Sent:* Tuesday, February 8, 2022 4:02:53 PM *To:* users

Re: [OMPI users] Using OSU benchmarks for checking Infiniband network

2022-02-08 Thread Joseph Schuchart via users
Hi Denis, Sorry if I missed it in your previous messages but could you also try running a different MPI implementation (MVAPICH) to see whether Open MPI is at fault or the system is somehow to blame for it? Thanks Joseph On 2/8/22 03:06, Bertini, Denis Dr. via users wrote: Hi Thanks for

Re: [OMPI users] OpenMPI and maker - Multiple messages

2021-02-18 Thread Joseph Schuchart via users
Thomas, The post you are referencing suggests running `mpiexec -mca btl ^openib -n 40 maker -help` but you are running `mpiexec -mca btl ^openib -N 5 gcc --version`, which will run 5 instances of GCC. The output you're seeing is totally to be expected. I don't think anyone here can help you

Re: [OMPI users] Issue with MPI_Get_processor_name() in Cygwin

2021-02-09 Thread Joseph Schuchart via users
Martin, The name argument to MPI_Get_processor_name is a character string of length at least MPI_MAX_PROCESSOR_NAME, which in OMPI is 256. You are providing a character string of length 200, so OMPI is free to write past the end of your string and into some of your stack variables, hence you

Re: [OMPI users] mpirun on Kubuntu 20.4.1 hangs

2020-10-22 Thread Joseph Schuchart via users
Hi Jorge, Can you try to get a stack trace of mpirun using the following command in a separate terminal? `sudo gdb -batch -ex "thread apply all bt" -p $(ps -C mpirun -o pid= | head -n 1)` Maybe that will give some insight where mpirun is hanging. Cheers, Joseph On 10/21/20 9:58 PM, Jorge

Re: [OMPI users] Limiting IP addresses used by OpenMPI

2020-09-01 Thread Joseph Schuchart via users
Charles, What is the machine configuration you're running on? It seems that there are two MCA parameters for the tcp btl: btl_tcp_if_include and btl_tcp_if_exclude (see ompi_info for details). There may be other knobs I'm not aware of. If you're using UCX then my guess is that UCX has its own

Re: [OMPI users] Is the mpi.3 manpage out of date?

2020-08-31 Thread Joseph Schuchart via users
Andy, Thanks for pointing this out. We have merged a fix that corrects that stale comment in master :) Cheers Joseph On 8/25/20 8:36 PM, Riebs, Andy via users wrote: In searching to confirm my belief that recent versions of Open MPI support the MPI-3.1 standard, I was a bit surprised to

Re: [OMPI users] Silent hangs with MPI_Ssend and MPI_Irecv

2020-07-25 Thread Joseph Schuchart via users
Hi Sean, Thanks for the report! I have a few questions/suggestions: 1) What version of Open MPI are you using? 2) What is your network? It sounds like you are on an IB cluster using btl/openib (which is essentially discontinued). Can you try the Open MPI 4.0.4 release with UCX instead of

Re: [OMPI users] MPI test suite

2020-07-24 Thread Joseph Schuchart via users
You may want to look into MTT: https://github.com/open-mpi/mtt Cheers Joseph On 7/23/20 8:28 PM, Zhang, Junchao via users wrote: Hello,   Does OMPI have a test suite that can let me validate MPI implementations from other vendors?   Thanks --Junchao Zhang

Re: [OMPI users] Vader - Where to Look for Shared Memory Use

2020-07-22 Thread Joseph Schuchart via users
Hi John, Depending on your platform the default behavior of Open MPI is to mmap a shared backing file that is either located in a session directory under /dev/shm or under $TMPDIR (I believe under Linux it is /dev/shm). You will find a set of files there that are used to back shared memory.

Re: [OMPI users] Coordinating (non-overlapping) local stores with remote puts form using passive RMA synchronization

2020-06-02 Thread Joseph Schuchart via users
Hi Stephen, Let me try to answer your questions inline (I don't have extensive experience with the separate model and from my experience most implementations support the unified model, with some exceptions): On 5/31/20 1:31 AM, Stephen Guzik via users wrote: Hi, I'm trying to get a better

Re: [OMPI users] RMA in openmpi

2020-04-27 Thread Joseph Schuchart via users
but I just wanted to confirm. Thanks again Claire On 27/04/2020, 07:50, "Joseph Schuchart via users" wrote: Claire, > Is it possible to use the one-sided communication without combining it with synchronization calls? What exactly do you mean by

Re: [OMPI users] RMA in openmpi

2020-04-27 Thread Joseph Schuchart via users
Claire, > Is it possible to use the one-sided communication without combining it with synchronization calls? What exactly do you mean by "synchronization calls"? MPI_Win_fence is indeed synchronizing (basically flush+barrier) but MPI_Win_lock (and the passive target synchronization

[OMPI users] Question about UCX progress throttling

2020-02-07 Thread Joseph Schuchart via users
Today I came across the two MCA parameters osc_ucx_progress_iterations and pml_ucx_progress_iterations in Open MPI. My interpretation of the description is that in a loop such as below, progress in UCX is only triggered every 100 iterations (assuming opal_progress is only called once per

Re: [OMPI users] mpirun --output-filename behavior

2019-10-31 Thread Joseph Schuchart via users
On 10/30/19 2:06 AM, Jeff Squyres (jsquyres) via users wrote: Oh, did the prior behavior *only* output to the file and not to stdout/stderr?  Huh. I guess a workaround for that would be:     mpirun  ... > /dev/null Just to throw in my $0.02: I recently found that the output to

[OMPI users] CPC only supported when the first QP is a PP QP?

2019-08-05 Thread Joseph Schuchart via users
I'm trying to run an MPI RMA application on an IB cluster and find that Open MPI is using the pt2pt rdma component instead of openib (or UCX). I tried getting some logs from Open MPI (current 3.1.x git): ``` $ mpirun -n 2 --mca btl_base_verbose 100 --mca osc_base_verbose 100 --mca

Re: [OMPI users] growing memory use from MPI application

2019-06-20 Thread Joseph Schuchart via users
Noam, Another idea: check for stale files in /dev/shm/ (or a subdirectory that looks like it belongs to UCX/OpenMPI) and SysV shared memory using `ipcs -m`. Joseph On 6/20/19 3:31 PM, Noam Bernstein via users wrote: On Jun 20, 2019, at 4:44 AM, Charles A Taylor >

Re: [OMPI users] Latencies of atomic operations on high-performance networks

2019-05-09 Thread Joseph Schuchart via users
node types. Joseph [1] https://github.com/open-mpi/ompi/issues/6536 On 5/9/19 9:10 AM, Benson Muite via users wrote: Hi, Have you tried anything with OpenMPI 4.0.1? What are the specifications of the Infiniband system you are using? Benson On 5/9/19 9:37 AM, Joseph Schuchart via users wrote

Re: [OMPI users] Latencies of atomic operations on high-performance networks

2019-05-09 Thread Joseph Schuchart via users
Nathan, Over the last couple of weeks I made some more interesting observations regarding the latencies of accumulate operations on both Aries and InfiniBand systems: 1) There seems to be a significant difference between 64bit and 32bit operations: on Aries, the average latency for