Re: [OMPI users] Fwd: Unable to run basic mpirun command (OpenMPI v5.0.3)

2024-05-05 Thread Jeff Squyres (jsquyres) via users
-mpi.org/en/v5.0.x/launching-apps/ssh.html#finding-open-mpi-executables-and-libraries. From: T Brouns Sent: Sunday, May 5, 2024 4:37 PM To: users@lists.open-mpi.org Cc: Jeff Squyres (jsquyres) ; hear...@gmail.com Subject: Re: [OMPI users] Fwd: Unable to run basic

Re: [OMPI users] Fwd: Unable to run basic mpirun command (OpenMPI v5.0.3)

2024-05-04 Thread Jeff Squyres (jsquyres) via users
, you could prefix your LD_LIBRARY_PATH environment variable with the libdir from the Open MPI installation you just created. From: T Brouns Sent: Saturday, May 4, 2024 10:56 AM To: Jeff Squyres (jsquyres) ; users@lists.open-mpi.org Subject: Re: [OMPI users
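The suggestion above can be sketched in shell. Note that /opt/openmpi/lib is a hypothetical path; substitute the libdir of the Open MPI installation you actually built:

```shell
# Hypothetical libdir; substitute the lib directory of the Open MPI
# installation you just built (your configure --prefix, plus /lib).
OMPI_LIBDIR=/opt/openmpi/lib

# Prefix LD_LIBRARY_PATH so this installation's libraries are found first;
# the ${VAR:+...} expansion avoids a trailing colon when the variable is unset.
export LD_LIBRARY_PATH="$OMPI_LIBDIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

Putting the new libdir first matters: the dynamic loader searches LD_LIBRARY_PATH left to right, so a stale system-wide Open MPI later in the path is shadowed.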

Re: [OMPI users] Fwd: Unable to run basic mpirun command (OpenMPI v5.0.3)

2024-05-03 Thread Jeff Squyres (jsquyres) via users
Your config.log file shows that you are trying to build Open MPI 2.1.6 and that configure failed. I'm not sure how to square this with the information that you provided in your message... did you upload the wrong config.log? Can you provide all the information from

Re: [OMPI users] [EXTERNAL] Help deciphering error message

2024-03-08 Thread Jeff Squyres (jsquyres) via users
(sorry this is so long – it's a bunch of explanations followed by 2 suggestions at the bottom) One additional thing worth mentioning is that your mpirun command line does not seem to be explicitly asking for the "ucx" PML component, but the error message you're getting indicates that you

Re: [OMPI users] Seg error when using v5.0.1

2024-01-31 Thread Jeff Squyres (jsquyres) via users
No worries – glad you figured it out! From: users on behalf of afernandez via users Sent: Wednesday, January 31, 2024 10:56 AM To: Open MPI Users Cc: afernandez Subject: Re: [OMPI users] Seg error when using v5.0.1 Hello, I'm sorry as I totally messed up

Re: [OMPI users] MPI Wireshark Packet Dissector

2023-12-11 Thread Jeff Squyres (jsquyres) via users
Cool! I dimly remember this project; it was written independently of the main Open MPI project. It looks like it supports the TCP OOB and TCP BTL. The TCP OOB has since moved from Open MPI's "ORTE" sub-project to the independent PRRTE project. Regardless, TCP OOB traffic is effectively about

Re: [OMPI users] OpenMPI 5.0.0 & Intel OneAPI 2023.2.0 on MacOS 14.0:

2023-11-06 Thread Jeff Squyres (jsquyres) via users
We develop and build with clang on macOS frequently; it would be surprising if it didn't work. That being said, I was able to replicate both errors reported here. On macOS Sonoma with Xcode 15.x and the OneAPI compilers: * configure fails in the PMIx libevent section, complaining about how

[OMPI users] Open MPI BOF at SC'23

2023-11-06 Thread Jeff Squyres (jsquyres) via users
We're excited to see everyone next week in Denver, Colorado, USA at SC23! Open MPI will be hosting our usual State of the Union Birds of a Feather (BOF) session on Wednesday, November 15, 2023, from 12:15-1:15pm US Mountain

Re: [OMPI users] OpenMPI 5.0.0 & Intel OneAPI 2023.2.0 on MacOS 14.0:

2023-10-30 Thread Jeff Squyres (jsquyres) via users
Volker -- If that doesn't work, send all the information requested here: https://docs.open-mpi.org/en/v5.0.x/getting-help.html From: users on behalf of Volker Blum via users Sent: Saturday, October 28, 2023 8:47 PM To: Matt Thompson Cc: Volker Blum ; Open MPI

Re: [OMPI users] MPI4Py Only Using Rank 0

2023-10-25 Thread Jeff Squyres (jsquyres) via users
From: caitlin lamirez Sent: Wednesday, October 25, 2023 1:17 PM To: Jeff Squyres (jsquyres) Subject: Re: [OMPI users] MPI4Py Only Using Rank 0 Hi Jeff, After getting that error, I did reinstall MPI4py using conda remove mpi4py and conda install mpi4py. However, I am still getting th

Re: [OMPI users] MPI4Py Only Using Rank 0

2023-10-25 Thread Jeff Squyres (jsquyres) via users
This usually means that you have accidentally switched to using a different MPI implementation under the covers somehow. E.g., did you somehow accidentally start using mpiexec from MPICH instead of Open MPI? Or did MPI4Py somehow get upgraded or otherwise re-build itself for MPICH, but
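A quick way to check for this kind of mismatch is sketched below; mpi4py's MPI.get_vendor() reports which MPI implementation the module was built against:

```shell
# Which mpiexec is first on the PATH, and which implementation is it?
which mpiexec
mpiexec --version

# Ask mpi4py which MPI library it was actually built against:
python -c "from mpi4py import MPI; print(MPI.get_vendor())"
```

If the launcher and mpi4py's underlying library disagree (e.g. an MPICH mpiexec launching an Open MPI-built mpi4py), every process sees itself as rank 0, which matches the symptom in the subject line.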

Re: [OMPI users] Binding to thread 0

2023-09-08 Thread Jeff Squyres (jsquyres) via users
In addition to what Gilles mentioned, I'm curious: is there a reason you have hardware threads enabled? You could disable them in the BIOS, and then each of your MPI processes can use the full core, not just a single hardware thread. From: users on behalf of

Re: [OMPI users] Segmentation fault

2023-08-09 Thread Jeff Squyres (jsquyres) via users
application that replicates the issue? That would be something we could dig into and investigate. From: Aziz Ogutlu Sent: Wednesday, August 9, 2023 10:31 AM To: Jeff Squyres (jsquyres) ; Open MPI Users Subject: Re: [OMPI users] Segmentation fault Hi Jeff, I'm

Re: [OMPI users] Segmentation fault

2023-08-09 Thread Jeff Squyres (jsquyres) via users
. From: Aziz Ogutlu Sent: Wednesday, August 9, 2023 10:08 AM To: Jeff Squyres (jsquyres) ; Open MPI Users Subject: Re: [OMPI users] Segmentation fault Hi Jeff, I also tried with OpenMPI 4.1.5, I got same error. On 8/9/23 17:05, Jeff Squyres (jsquyres) wrote: I'm afraid I

Re: [OMPI users] Segmentation fault

2023-08-09 Thread Jeff Squyres (jsquyres) via users
I'm afraid I don't know anything about the SU2 application. You are using Open MPI v4.0.3, which is fairly old. Many bug fixes have been released since that version. Can you upgrade to the latest version of Open MPI (v4.1.5)? From: users on behalf of Aziz

Re: [OMPI users] [EXT] Re: Error handling

2023-07-19 Thread Jeff Squyres (jsquyres) via users
MPI_Allreduce should work just fine, even with negative numbers. If you are seeing something different, can you provide a small reproducer program that shows the problem? We can dig deeper into it if we can reproduce the problem. mpirun's exit status can't distinguish between MPI processes

Re: [OMPI users] libnuma.so error

2023-07-19 Thread Jeff Squyres (jsquyres) via users
It's not clear if that message is being emitted by Open MPI. It does say it's falling back to a different behavior if libnuma.so is not found, so it appears it's treating it as a warning, not an error. From: users on behalf of Luis Cebamanos via users Sent:

Re: [OMPI users] Error build Open MPI 4.1.5 with GCC 11.3

2023-07-18 Thread Jeff Squyres (jsquyres) via users
nt: Tuesday, July 18, 2023 12:51 PM To: Jeff Squyres (jsquyres) Cc: Open MPI Users Subject: Re: [OMPI users] Error build Open MPI 4.1.5 with GCC 11.3 As soon as you pointed out /usr/lib/gcc/x86_64-linux-gnu/9/include/float.h that made me think of the previous build. I did "make clean" a _

Re: [OMPI users] Error build Open MPI 4.1.5 with GCC 11.3

2023-07-18 Thread Jeff Squyres (jsquyres) via users
if you had run make clean and then re-ran configure, it probably would have built ok. But deleting the whole source tree and re-configuring + re-building also works. :-) From: Jeffrey Layton Sent: Tuesday, July 18, 2023 11:38 AM To: Jeff Squyres (jsquyres) Cc: Ope

Re: [OMPI users] Error build Open MPI 4.1.5 with GCC 11.3

2023-07-17 Thread Jeff Squyres (jsquyres) via users
That's a little odd. Usually, the specific .h files that are listed as dependencies came from somewhere -- usually either part of the GNU Autotools dependency analysis. I'm guessing that /usr/lib/gcc/x86_64-linux-gnu/9/include/float.h doesn't actually exist on your system -- but then how did

Re: [OMPI users] OMPI compilation error in Making all datatypes

2023-07-12 Thread Jeff Squyres (jsquyres) via users
ge Bosilca Sent: Wednesday, July 12, 2023 2:26 PM To: Open MPI Users Cc: Jeff Squyres (jsquyres) ; Elad Cohen Subject: Re: [OMPI users] OMPI compilation error in Making all datatypes I can't replicate this on my setting, but I am not using the tar archive from the OMPI website (I use the git ta

Re: [OMPI users] OMPI compilation error in Making all datatypes

2023-07-12 Thread Jeff Squyres (jsquyres) via users
The output you sent (in the attached tarball) doesn't really make much sense: libtool: link: ar cru .libs/libdatatype_reliable.a .libs/libdatatype_reliable_la-opal_datatype_pack.o .libs/libdatatype_reliable_la-opal_datatype_unpack.o libtool: link: ranlib .libs/libdatatype_reliable.a

Re: [OMPI users] Issue with Running MPI Job on CentOS 7

2023-06-14 Thread Jeff Squyres (jsquyres) via users
plications in the "examples" directory. From: 深空探测 Sent: Tuesday, June 13, 2023 8:59 PM To: Open MPI Users Cc: John Hearns ; Jeff Squyres (jsquyres) ; gilles.gouaillar...@gmail.com ; t...@pasteur.fr Subject: Re: [OMPI users] Issue with Running MPI Job on Cent

Re: [OMPI users] Issue with Running MPI Job on CentOS 7

2023-06-12 Thread Jeff Squyres (jsquyres) via users
Your steps are generally correct, but I cannot speak for whether your /home/wude/.bashrc file is executed for both non-interactive and interactive logins. If /home/wude is your $HOME, it probably is, but I don't know about your specific system. Also, you should be aware that MPI applications

Re: [MTT users] MPI error 11 segmentation fault

2023-06-05 Thread Jeff Squyres (jsquyres) via mtt-users
Greetings Macro. I think you directed this email to the wrong mailing list -- this list is for users of the MPI Testing Tool, which is a specific tool that we use in the development and testing of Open MPI itself. General user errors should likely be reported either to the user's mailing list

Re: [OMPI users] What is the best choice of pml and btl for intranode communication

2023-03-06 Thread Jeff Squyres (jsquyres) via users
is selected make my above comment moot. Sorry for any confusion! From: users on behalf of Jeff Squyres (jsquyres) via users Sent: Monday, March 6, 2023 10:40 AM To: Chandran, Arun ; Open MPI Users Cc: Jeff Squyres (jsquyres) Subject: Re: [OMPI users] What

Re: [OMPI users] What is the best choice of pml and btl for intranode communication

2023-03-06 Thread Jeff Squyres (jsquyres) via users
is an open question to George / the UCX team) From: Chandran, Arun Sent: Monday, March 6, 2023 10:31 AM To: Jeff Squyres (jsquyres) ; Open MPI Users Subject: RE: [OMPI users] What is the best choice of pml and btl for intranode communication [Public] Hi, Yes, i

Re: [OMPI users] What is the best choice of pml and btl for intranode communication

2023-03-06 Thread Jeff Squyres (jsquyres) via users
If this run was on a single node, then UCX probably disabled itself since it wouldn't be using InfiniBand or RoCE to communicate between peers. Also, I'm not sure your command line was correct: perf_benchmark $ mpirun -np 32 --map-by core --bind-to core ./perf --mca pml ucx You probably

Re: [OMPI users] Compile options to disable Infiniband

2022-12-12 Thread Jeff Squyres (jsquyres) via users
You can use: ./configure --enable-mca-no-build=btl-openib,pml-ucx,mtl-psm That should probably do it in the 3.x and 4.x series. You can double-check after it installs: look in $prefix/lib/openmpi for any files with "ucx", "openib", or "psm" in them. If they're there, remove them (those
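The post-install check described above can be scripted. A sketch, where check_leftovers is a hypothetical helper and /opt/openmpi stands in for your actual --prefix:

```shell
# After: ./configure --enable-mca-no-build=btl-openib,pml-ucx,mtl-psm ... && make install
# list any ucx/openib/psm plugin files left over from a previous build.
check_leftovers() {
  prefix="$1"
  for f in "$prefix"/lib/openmpi/*ucx* \
           "$prefix"/lib/openmpi/*openib* \
           "$prefix"/lib/openmpi/*psm*; do
    # Unmatched globs stay literal, so test that the path really exists.
    if [ -e "$f" ]; then
      echo "leftover plugin: $f"
    fi
  done
}

check_leftovers /opt/openmpi   # hypothetical install prefix
```

Stale plugins matter because Open MPI dlopens everything it finds in that directory, regardless of what the current build was configured with.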

Re: [OMPI users] mpi program gets stuck

2022-12-07 Thread Jeff Squyres (jsquyres) via users
is continuing to investigate. If it turns into a problem with Open MPI, we'll report back here. -- Jeff Squyres jsquy...@cisco.com From: Jeff Squyres (jsquyres) Sent: Wednesday, November 30, 2022 7:42 AM To: timesir ; Open MPI Users Subject: Re: mpi program gets stuck Ok

Re: [OMPI users] Can't run an MPI program through mpirun command

2022-12-04 Thread Jeff Squyres (jsquyres) via users
Can you try steps 1-3 in https://docs.open-mpi.org/en/v5.0.x/validate.html#testing-your-open-mpi-installation ? -- Jeff Squyres jsquy...@cisco.com From: users on behalf of Blaze Kort via users Sent: Saturday, December 3, 2022 5:52 AM To:

Re: [OMPI users] mpi program gets stuck

2022-12-01 Thread Jeff Squyres (jsquyres) via users
, November 29, 2022 9:44 PM To: Jeff Squyres (jsquyres) Subject: Re: mpi program gets stuck Do you think the information below is enough? If not, I will add more (py3.9) ➜ /share cat hosts 192.168.180.48 slots=1 192.168.60.203 slots=1 (py3.9) ➜ examples mpirun -n 2 --machinefile hosts --mca

Re: [OMPI users] mpi program gets stuck

2022-11-29 Thread Jeff Squyres (jsquyres) via users
c and ring_c. -- Jeff Squyres jsquy...@cisco.com From: timesir Sent: Tuesday, November 29, 2022 10:42 AM To: Jeff Squyres (jsquyres) ; Open MPI Users Subject: mpi program gets stuck see also: https://pastebin.com/s5tjaUkF (py3.9) ➜ /share cat host

Re: [OMPI users] CephFS and striping_factor

2022-11-29 Thread Jeff Squyres (jsquyres) via users
More specifically, Gilles created a skeleton "ceph" component in this draft pull request: https://github.com/open-mpi/ompi/pull/11122 If anyone has any cycles to work on it and develop it beyond the skeleton that is currently there, that would be great! -- Jeff Squyres jsquy...@cisco.com

Re: [OMPI users] Question about "mca" parameters

2022-11-29 Thread Jeff Squyres (jsquyres) via users
Also, you probably want to add "vader" into your BTL specification. Although the name is counter-intuitive, "vader" in Open MPI v3.x and v4.x is the shared memory transport. Hence, if you run with "btl=tcp,self", you are only allowing MPI processes to talk via the TCP stack or process
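For example (a sketch; ./my_app is a placeholder program), in Open MPI v3.x/v4.x the shared-memory transport is enabled by listing vader alongside tcp and self:

```shell
# Without vader, on-node peers must fall back to the TCP loopback stack:
mpirun --mca btl tcp,self -np 4 ./my_app

# With vader, on-node peers communicate via shared memory instead:
mpirun --mca btl tcp,vader,self -np 4 ./my_app
```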

Re: [OMPI users] users Digest, Vol 4818, Issue 1

2022-11-25 Thread Jeff Squyres (jsquyres) via users
: timesir Sent: Friday, November 18, 2022 10:20 AM To: Jeff Squyres (jsquyres) ; users@lists.open-mpi.org ; gilles.gouaillar...@gmail.com Subject: Re: users Digest, Vol 4818, Issue 1 (py3.9) ➜ /share ompi_info --version Open MPI v5.0.0rc9 https://www.open-mpi.org/community/help/ (py3.9

Re: [OMPI users] users Digest, Vol 4818, Issue 1

2022-11-25 Thread Jeff Squyres (jsquyres) via users
"em dash", or somesuch. -- Jeff Squyres jsquy...@cisco.com From: timesir Sent: Friday, November 18, 2022 8:59 AM To: Jeff Squyres (jsquyres) ; users@lists.open-mpi.org ; gilles.gouaillar...@gmail.com Subject: Re: users Digest, Vol 4818, Issue 1 The

Re: [OMPI users] users Digest, Vol 4818, Issue 1

2022-11-25 Thread Jeff Squyres (jsquyres) via users
Monday, November 14, 2022 11:32 PM To: users@lists.open-mpi.org ; Jeff Squyres (jsquyres) ; gilles.gouaillar...@gmail.com Subject: Re: users Digest, Vol 4818, Issue 1 (py3.9) ➜ /share mpirun -n 2 --machinefile hosts --mca rmaps_base_verbose 100 --mca ras_base_verbose 100 which mpirun [compute

Re: [OMPI users] users Digest, Vol 4818, Issue 1

2022-11-25 Thread Jeff Squyres (jsquyres) via users
From: timesir Sent: Friday, November 18, 2022 8:49 AM To: Jeff Squyres (jsquyres) ; users@lists.open-mpi.org ; gilles.gouaillar...@gmail.com Subject: Re: users Digest, Vol 4818, Issue 1 The information you need is attached. On 2022/11/18 21:08, Jeff Squyres (jsquyres) wrote: Yes

Re: [OMPI users] Tracing of openmpi internal functions

2022-11-14 Thread Jeff Squyres (jsquyres) via users
Open MPI uses plug-in modules for its implementations of the MPI collective algorithms. From that perspective, once you understand that infrastructure, it's exactly the same regardless of whether the MPI job is using intra-node or inter-node collectives. We don't have much in the way of

Re: [OMPI users] [OMPI devel] There are not enough slots available in the system to satisfy the 2, slots that were requested by the application

2022-11-14 Thread Jeff Squyres (jsquyres) via users
reads instead of the number of processor cores, use the --use-hwthread-cpus option. Alternatively, you can use the --map-by :OVERSUBSCRIBE option to ignore the number of available slots when deciding the number of processes to launch.
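The two alternatives above look like this on the command line (a sketch; ./my_app is a placeholder program):

```shell
# Count hardware threads, rather than processor cores, as slots:
mpirun --use-hwthread-cpus -np 2 ./my_app

# Or ignore the available slot count entirely and oversubscribe:
mpirun --map-by :OVERSUBSCRIBE -np 2 ./my_app
```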

Re: [OMPI users] [OMPI devel] There are not enough slots available in the system to satisfy the 2, slots that were requested by the application

2022-11-13 Thread Jeff Squyres (jsquyres) via users
_ From: 龙龙 Sent: Sunday, November 13, 2022 3:13 AM To: Jeff Squyres (jsquyres) ; Open MPI Users Subject: Re: [OMPI devel] There are not enough slots available in the system to satisfy the 2, slots that were requested by the application (py3.9) ➜ /share mpirun –version mpirun (Open MPI) 5.0.0r

Re: [OMPI users] --mca btl_base_verbose 30 not working in version 5.0

2022-11-07 Thread Jeff Squyres (jsquyres) via users
Sorry for the delay in replying. To tie up this thread for the web mail archives: this same question was cross-posted over in the devel list; I replied there. -- Jeff Squyres jsquy...@cisco.com From: users on behalf of mrlong via users Sent: Sunday, October

Re: [OMPI users] [OMPI devel] There are not enough slots available in the system to satisfy the 2, slots that were requested by the application

2022-11-07 Thread Jeff Squyres (jsquyres) via users
In the future, can you please just mail one of the lists? This particular question is probably more of a users type of question (since we're not talking about the internals of Open MPI itself), so I'll reply just on the users list. For what it's worth, I'm unable to replicate your error: $

Re: [OMPI users] [EXTERNAL] Beginner Troubleshooting OpenMPI Installation - pmi.h Error

2022-10-06 Thread Jeff Squyres (jsquyres) via users
" is in this file? If that's the case, then that's where Open MPI is getting these CLI arguments. -- Jeff Squyres jsquy...@cisco.com From: Jeffrey D. (JD) Tamucci Sent: Wednesday, October 5, 2022 5:16 PM To: Jeff Squyres (jsquyres) Cc: Open MPI Users ; Pritchard Jr.,

Re: [OMPI users] [EXTERNAL] Beginner Troubleshooting OpenMPI Installation - pmi.h Error

2022-10-05 Thread Jeff Squyres (jsquyres) via users
Actually, I think the problem might be a little more subtle. I see that you configured with both --enable-static and --enable-shared. My gut reaction is that there might be some kind of issue with enabling both of those options (by default, shared is enabled and static is disabled). If you

Re: [OMPI users] openmpi compile failure

2022-09-28 Thread Jeff Squyres (jsquyres) via users
.deps/signal.Tpo -c \ ../../../../../../opal/mca/event/libevent2022/libevent/signal.c -fPIC \ -DPIC -E > signal-preprocessed.c -- Jeff Squyres jsquy...@cisco.com From: Zilore Mumba Sent: Wednesday, September 28, 2022 1:50 AM To: Jeff Squyres (jsquyres) Cc: users@lists.op

Re: [OMPI users] openmpi compile failure

2022-09-27 Thread Jeff Squyres (jsquyres) via users
mber 27, 2022 2:51 PM To: Jeff Squyres (jsquyres) Cc: users@lists.open-mpi.org Subject: Re: [OMPI users] openmpi compile failure Thanks Jeff, I have tried with openmpi-4.1.4, but I still get the same error. The main error being ../../../../../../opal/mca/event/libevent2022/libevent/signal.c:135

Re: [OMPI users] openmpi compile failure

2022-09-27 Thread Jeff Squyres (jsquyres) via users
Can you re-try with the latest Open MPI v4.1.x release (v4.1.4)? There have been many bug fixes since v4.1.0. -- Jeff Squyres jsquy...@cisco.com From: users on behalf of Zilore Mumba via users Sent: Tuesday, September 27, 2022 5:10 AM To:

Re: [OMPI users] --mca parameter explainer; mpirun WARNING: There was an error initializing an OpenFabrics device

2022-09-26 Thread Jeff Squyres (jsquyres) via users
Just to follow up for the email web archives: this issue was followed up in https://github.com/open-mpi/ompi/issues/10841. -- Jeff Squyres jsquy...@cisco.com From: users on behalf of Rob Kudyba via users Sent: Thursday, September 22, 2022 2:15 PM To:

Re: [OMPI users] Hardware topology influence

2022-09-14 Thread Jeff Squyres (jsquyres) via users
). This will allow the MPI processes to use shared memory for on-node communication. -- Jeff Squyres jsquy...@cisco.com From: Jeff Squyres (jsquyres) Sent: Tuesday, September 13, 2022 10:08 AM To: Open MPI Users Cc: Gilles Gouaillardet Subject: Re: [OMPI users

Re: [OMPI users] Hardware topology influence

2022-09-13 Thread Jeff Squyres (jsquyres) via users
Let me add a little more color on what Gilles stated. First, you should probably upgrade to the latest v4.1.x release: v4.1.4. It has a bunch of bug fixes compared to v4.1.0. Second, you should know that it is relatively uncommon to run HPC/MPI apps inside VMs because the virtualization

Re: [OMPI users] Disabling barrier in MPI_Finalize

2022-09-09 Thread Jeff Squyres (jsquyres) via users
No, it does not, sorry. What are you trying to do? -- Jeff Squyres jsquy...@cisco.com From: users on behalf of Mccall, Kurt E. (MSFC-EV41) via users Sent: Friday, September 9, 2022 2:30 PM To: OpenMpi User List (users@lists.open-mpi.org) Cc: Mccall, Kurt E.

Re: [OMPI users] MPI with RoCE

2022-09-06 Thread Jeff Squyres (jsquyres) via users
You can think of RoCE as "IB over IP" -- RoCE is essentially the IB protocol over IP packets (which is different than IPoIB, which is emulating IP and TCP over the InfiniBand protocol). You'll need to consult the docs for your Mellanox cards, but if you have Ethernet cards, you'll want to set

Re: [OMPI users] ucx problems

2022-08-31 Thread Jeff Squyres (jsquyres) via users
Yes, that is the intended behavior: Open MPI basically only uses UCX for IB transports (and shared memory -- but only when also used with IB transports). If IB can't be used, the UCX PML disqualifies itself. This is by design, even though UCX can handle other transports (including TCP and

Re: [OMPI users] Oldest version of SLURM in use?

2022-08-17 Thread Jeff Squyres (jsquyres) via users
com From: Tim Carlson Sent: Wednesday, August 17, 2022 11:34 AM To: Open MPI Users Cc: Jeff Squyres (jsquyres) Subject: Re: [OMPI users] Oldest version of SLURM in use? To be honest, I only upgrade SLURM when there is a feature I absolutely have to have, or a big bug that needs to be fixed. L

Re: [OMPI users] Oldest version of SLURM in use?

2022-08-17 Thread Jeff Squyres (jsquyres) via users
orner. Pardon my rambling, the upshot is, some lazy/disorganized people rely on third-party packagers, and do get pretty far behind. On Tue, Aug 16, 2022 at 9:54 AM Jeff Squyres (jsquyres) via users wrote: I have a curiosity question for the Open MPI user community

[OMPI users] Oldest version of SLURM in use?

2022-08-16 Thread Jeff Squyres (jsquyres) via users
I have a curiosity question for the Open MPI user community: what version of SLURM are you using? I ask because we're honestly curious about what the expectations are regarding new versions of Open MPI supporting older versions of SLURM. I believe that SchedMD's policy is that they support up

Re: [OMPI users] RUNPATH vs. RPATH

2022-08-11 Thread Jeff Squyres (jsquyres) via users
jsquy...@cisco.com From: Reuti Sent: Tuesday, August 9, 2022 12:03 PM To: Open MPI Users Cc: Jeff Squyres (jsquyres); zuelc...@staff.uni-marburg.de Subject: Re: [OMPI users] RUNPATH vs. RPATH Hi Jeff, > Am 09.08.2022 um 16:17 schrieb Jeff Squyres (jsquyres)

Re: [OMPI users] RUNPATH vs. RPATH

2022-08-10 Thread Jeff Squyres (jsquyres) via users
@cisco.com From: Reuti Sent: Tuesday, August 9, 2022 12:03 PM To: Open MPI Users Cc: Jeff Squyres (jsquyres); zuelc...@staff.uni-marburg.de Subject: Re: [OMPI users] RUNPATH vs. RPATH Hi Jeff, > Am 09.08.2022 um 16:17 schrieb Jeff Squyres (jsquyres) via users > : > > Just to fo

[OMPI users] Open MPI Java MPI bindings

2022-08-09 Thread Jeff Squyres (jsquyres) via users
During a planning meeting for Open MPI v5.0.0 today, the question came up: is anyone using the Open MPI Java bindings? These bindings are not official MPI Forum bindings -- they are an Open MPI-specific extension. They were added a few years ago as a result of a research project. We ask

Re: [OMPI users] RUNPATH vs. RPATH

2022-08-09 Thread Jeff Squyres (jsquyres) via users
for your environment, but you might want to check the output of "readelf -d ..." to be sure. Does that additional text help explain things? -- Jeff Squyres jsquy...@cisco.com From: Jeff Squyres (jsquyres) Sent: Saturday, August 6, 2022 9:36 AM To: Open

Re: [OMPI users] Problem with OpenMPI as Third pary library

2022-08-09 Thread Jeff Squyres (jsquyres) via users
I can't see the image that you sent; it seems to be broken. But I think you're asking about this: https://www.open-mpi.org/faq/?category=building#installdirs -- Jeff Squyres jsquy...@cisco.com From: users on behalf of Sebastian Gutierrez via users Sent:

Re: [OMPI users] RUNPATH vs. RPATH

2022-08-06 Thread Jeff Squyres (jsquyres) via users
Reuti -- See my disclaimers on other posts about apologies for taking so long to reply! This code was written forever ago; I had to dig through it a bit, read the comments and commit messages, and try to remember why it was done this way. What I thought would be a 5-minute search turned into

Re: [OMPI users] Multiple IPs on network interface

2022-07-07 Thread Jeff Squyres (jsquyres) via users
Can you send the full output of "ifconfig" (or "ip addr") from one of your compute nodes? -- Jeff Squyres jsquy...@cisco.com From: users on behalf of George Johnson via users Sent: Monday, July 4, 2022 11:06 AM To: users@lists.open-mpi.org Cc: George

Re: [OMPI users] Intercommunicator issue (any standard about communicator?)

2022-06-24 Thread Jeff Squyres (jsquyres) via users
Open MPI and MPICH are completely unrelated -- we're entirely different code bases (note that Intel MPI is derived from MPICH). Case in point is what Gilles cited: Open MPI chose to implement MPI_Comm handles as pointers, but MPICH chose to implement MPI_Comm handles as integers. Hence, you

Re: [OMPI users] Intercommunicator issue (any standard about communicator?)

2022-06-24 Thread Jeff Squyres (jsquyres) via users
Guillaume -- There is an MPI Standard document that you can obtain from mpi-forum.org. Open MPI v4.x adheres to MPI version 3.1 (the latest version of the MPI standard is v4.0, but that is unrelated to Open MPI's version number). Frankly, Open MPI's support of the dynamic API functionality

Re: [OMPI users] OpenMPI and names of the nodes in a cluster

2022-06-24 Thread Jeff Squyres (jsquyres) via users
s jsquy...@cisco.com From: Patrick Begou Sent: Tuesday, June 21, 2022 12:10 PM To: Jeff Squyres (jsquyres); Open MPI Users Subject: Re: [OMPI users] OpenMPI and names of the nodes in a cluster Hi Jeff, Unfortunately the workaround with "--mca reg

Re: [OMPI users] OpenMPI and names of the nodes in a cluster

2022-06-16 Thread Jeff Squyres (jsquyres) via users
ee with the standard. Patrick On 16/06/2022 at 14:30, Jeff Squyres (jsquyres) wrote: What exactly is the error that is occurring? -- Jeff Squyres jsquy...@cisco.com From:

Re: [OMPI users] OpenMPI and names of the nodes in a cluster

2022-06-16 Thread Jeff Squyres (jsquyres) via users
What exactly is the error that is occurring? -- Jeff Squyres jsquy...@cisco.com From: users on behalf of Patrick Begou via users Sent: Thursday, June 16, 2022 3:21 AM To: Open MPI Users Cc: Patrick Begou Subject: [OMPI users] OpenMPI and names of the

[OMPI users] Passing of an MPI luminary: Rusty Lusk

2022-05-23 Thread Jeff Squyres (jsquyres) via users
In case you had not heard, Dr. Ewing "Rusty" Lusk passed away at age 78 last week. Rusty was one of the founders and prime movers of the entire MPI ecosystem: the MPI Forum, the MPI standard, and MPICH. Without Rusty, our community would not exist. In addition to all of that, he was an

Re: [OMPI users] Network traffic packets documentation

2022-05-17 Thread Jeff Squyres (jsquyres) via users
ts source code repo: https://github.com/openpmix/openpmix/. It's a different project than Open MPI, but you can certainly ask questions on their mailing lists, too. -- Jeff Squyres jsquy...@cisco.com From: victor sv Sent: Tuesday, May 17, 2022 4:00 AM To: Je

Re: [OMPI users] Network traffic packets documentation

2022-05-16 Thread Jeff Squyres (jsquyres) via users
: victor sv Sent: Monday, May 16, 2022 1:17 PM To: Jeff Squyres (jsquyres) Cc: users@lists.open-mpi.org Subject: Re: [OMPI users] Network traffic packets documentation Hi Jeff, Ok, maybe "packet headers" are not the right words. What I would like to know is how MPI application data is

Re: [OMPI users] Network traffic packets documentation

2022-05-16 Thread Jeff Squyres (jsquyres) via users
Open MPI doesn't prescribe a specific network protocol for anything. Indeed, each network transport uses their own protocols, headers, etc. It's basically a "each Open MPI plugin needs to be able to talk to itself", and therefore no commonality is needed (or desired). Which network and Open

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread Jeff Squyres (jsquyres) via users
- Jeff Squyres jsquy...@cisco.com From: users on behalf of Jeff Squyres (jsquyres) via users Sent: Thursday, May 5, 2022 3:31 PM To: George Bosilca; Open MPI Users Cc: Jeff Squyres (jsquyres) Subject: Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3 Scott a

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread Jeff Squyres (jsquyres) via users
3:19 PM To: Open MPI Users Cc: Jeff Squyres (jsquyres); Scott Sayres Subject: Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3 That is weird, but maybe it is not a deadlock, but a very slow progress. In the child can you print the fdmax and i in the frame do_child. George. On Thu, May 5

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread Jeff Squyres (jsquyres) via users
You can use "lldb -p PID" to attach to a running process. -- Jeff Squyres jsquy...@cisco.com From: Scott Sayres Sent: Thursday, May 5, 2022 11:22 AM To: Jeff Squyres (jsquyres) Cc: Open MPI Users Subject: Re: [OMPI users] mpirun hangs on m1 mac

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread Jeff Squyres (jsquyres) via users
- Jeff Squyres jsquy...@cisco.com From: Scott Sayres Sent: Wednesday, May 4, 2022 4:02 PM To: Jeff Squyres (jsquyres) Cc: Open MPI Users Subject: Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3 foo.sh is executable, again hangs without output. I co

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-04 Thread Jeff Squyres (jsquyres) via users
From: Scott Sayres Sent: Wednesday, May 4, 2022 2:47 PM To: Open MPI Users Cc: Jeff Squyres (jsquyres) Subject: Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3 Following Jeff's advice, I have rebuilt open-mpi by hand using the -g option. This shows more information as below

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-04 Thread Jeff Squyres (jsquyres) via users
[scotts-mbp.3500.dhcp.###:05469] [[48286,0],0] Releasing job data for [INVALID] Can you recommend a way to find where mpirun gets stuck? Thanks! Scott On Wed, May 4, 2022 at 6:06 AM Jeff Squyres (jsquyres) wrote: Are you able to use mpirun to launch a non

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-04 Thread Jeff Squyres (jsquyres) via users
Are you able to use mpirun to launch a non-MPI application? E.g.: mpirun -np 2 hostname And if that works, can you run the simple example MPI apps in the "examples" directory of the MPI source tarball (the "hello world" and "ring" programs)? E.g.: cd examples make mpirun -np 4 hello_c

Re: [OMPI users] help with M1 chip macOS openMPI installation

2022-04-22 Thread Jeff Squyres (jsquyres) via users
ary translator. We even had some discussions about this, on the mailing list (or github issues). 3. Based on your original message, and their webpage, MARE2DEM is not supporting any other compilation chain than Intel. As explained above, that might not be by itself a showstopper, because you

Re: [OMPI users] Help diagnosing MPI+OpenMP application segmentation fault only when run with --bind-to none

2022-04-22 Thread Jeff Squyres (jsquyres) via users
With THREAD_FUNNELED, it means that there can only be one thread in MPI at a time -- and it needs to be the same thread as the one that called MPI_INIT_THREAD. Is that the case in your app? Also, what is your app doing at src/pcorona_main.f90:627? Is it making a call to MPI, or something

Re: [OMPI users] help with M1 chip macOS openMPI installation

2022-04-21 Thread Jeff Squyres (jsquyres) via users
A little more color on Gilles' answer: I believe that we had some Open MPI community members work on adding M1 support to Open MPI, but Gilles is absolutely correct: the underlying compiler has to support the M1, or you won't get anywhere. -- Jeff Squyres jsquy...@cisco.com

Re: [OMPI users] mixed OpenMP/MPI

2022-03-15 Thread Jeff Squyres (jsquyres) via users
Thanks for the poke! Sorry we missed replying to your github issue. Josh replied to it this morning. -- Jeff Squyres jsquy...@cisco.com From: users on behalf of Bernstein, Noam CIV USN NRL (6393) Washington DC (USA) via users Sent: Tuesday, March

Re: [OMPI users] handle_wc() in openib and IBV_WC_DRIVER2/MLX5DV_WC_RAW_WQE completion code

2022-02-23 Thread Jeff Squyres (jsquyres) via users
The short answer is likely that UCX and Open MPI v4.1.x is your way forward. openib has basically been unmaintained for quite a while -- Nvidia (Mellanox) made it quite clear long ago that UCX was their path forward. openib was kept around until UCX became stable enough to become the preferred

Re: [OMPI users] Unknown breakdown (Transport retry count exceeded on mlx5_0:1/IB)

2022-02-23 Thread Jeff Squyres (jsquyres) via users
I can't comment much on UCX; you'll need to ask Nvidia for support on that. But transport retry count exceeded errors mean that the underlying IB network tried to send a message a bunch of times but never received the corresponding ACK from the receiver indicating that the receiver successfully
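When chasing "transport retry count exceeded" errors of the kind described here, a common first step is to sanity-check the physical link state and error counters of the HCA ports on both endpoints. A sketch using the standard InfiniBand diagnostic tools (these assume the `infiniband-diags` and `libibverbs-utils` packages are installed; package names vary by distribution):

```shell
# Show port state and rate for each HCA; ports should report
# "State: Active" and "Physical state: LinkUp".
ibstat

# Query basic device attributes (firmware version, ports) via libibverbs.
ibv_devinfo

# Dump performance/error counters for the local port; rising
# SymbolErrorCounter or LinkDownedCounter values point at cabling
# or switch problems rather than software.
perfquery
```

Since the error means repeated sends went unacknowledged, hardware-level counters on both the sender and receiver are usually more informative than anything in the MPI layer.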

Re: [OMPI users] Trouble compiling OpenMPI with Infiniband support

2022-02-23 Thread Jeff Squyres (jsquyres) via users
I'd recommend against using Open MPI v3.1.0 -- it's quite old. If you have to use Open MPI v3.1.x, I'd at least suggest using v3.1.6, which has all the rolled-up bug fixes on the v3.1.x series. That being said, Open MPI v4.1.2 is the most current. Open MPI v4.1.2 does restrict which versions

Re: [OMPI users] Building Open MPI without zlib: what might go wrong/different?

2022-01-31 Thread Jeff Squyres (jsquyres) via users
It's used for compressing the startup time messages in PMIx. I.e., the traffic for when you "mpirun ...". It's mostly beneficial when launching very large MPI jobs. If you're only launching across several nodes, the performance improvement isn't really noticeable. -- Jeff Squyres

Re: [OMPI users] RES: OpenMPI - Intel MPI

2022-01-27 Thread Jeff Squyres (jsquyres) via users
This is part of the challenge of HPC: there are general solutions, but no specific silver bullet that works in all scenarios. In short: everyone's setup is different. So we can offer advice, but not necessarily a 100%-guaranteed solution that will work in your environment. In general, we

Re: [OMPI users] Gadget2 error 818 when using more than 1 process?

2022-01-27 Thread Jeff Squyres (jsquyres) via users
From: users on behalf of Diego Zuccato via users Sent: Wednesday, January 26, 2022 2:06 AM To: users@lists.open-mpi.org Cc: Diego Zuccato Subject: Re: [OMPI users] Gadget2 error 818 when using more than 1 process? Il 26/01/2022 02:10, Jeff Squyres (jsquyres) via

Re: [OMPI users] Gadget2 error 818 when using more than 1 process?

2022-01-25 Thread Jeff Squyres (jsquyres) via users
I'm afraid I don't know anything about Gadget, so I can't comment there. How exactly does the application fail? Can you try upgrading to Open MPI v4.1.2? What networking are you using? -- Jeff Squyres jsquy...@cisco.com From: users on behalf of Diego

Re: [OMPI users] NAG Fortran 2018 bindings with Open MPI 4.1.2

2022-01-04 Thread Jeff Squyres (jsquyres) via users
fixed everything yet. -- Jeff Squyres jsquy...@cisco.com From: users on behalf of Paul Kapinos via users Sent: Tuesday, January 4, 2022 4:27 AM To: Jeff Squyres (jsquyres) via users Cc: Paul Kapinos Subject: Re: [OMPI users] NAG Fortran 2018 bindings with Open MPI

Re: [OMPI users] NAG Fortran 2018 bindings with Open MPI 4.1.2

2021-12-30 Thread Jeff Squyres (jsquyres) via users
From: users on behalf of Jeff Squyres (jsquyres) via users Sent: Thursday, December 30, 2021 4:39 PM To: Matt Thompson Cc: Jeff Squyres (jsquyres); Open MPI Users Subject: Re: [OMPI users] NAG Fortran 2018 bindings with Open MPI 4.1.2 Sweet; thanks! The top-level Fortran test is here: https

Re: [OMPI users] NAG Fortran 2018 bindings with Open MPI 4.1.2

2021-12-30 Thread Jeff Squyres (jsquyres) via users
, you should be able to find the corresponding .m4 file for the test source code. --  Jeff Squyres jsquy...@cisco.com From: Matt Thompson Sent: Thursday, December 30, 2021 4:01 PM To: Jeff Squyres (jsquyres) Cc: Wadud Miah; Open MPI Users Subject: Re

Re: [OMPI users] Mac OS + openmpi-4.1.2 + intel oneapi

2021-12-30 Thread Jeff Squyres (jsquyres) via users
PM To: Jeff Squyres (jsquyres) Cc: Open MPI Users; Christophe Peyret Subject: Re: [OMPI users] Mac OS + openmpi-4.1.2 + intel oneapi Jeff, I'm not sure it'll happen. For understandable reasons (for Intel), I think Intel is not putting too much emphasis on supporting macOS. I guess since I had

Re: [OMPI users] NAG Fortran 2018 bindings with Open MPI 4.1.2

2021-12-30 Thread Jeff Squyres (jsquyres) via users
Snarky comments from the NAG tech support people aside, if they could be a little more specific about what non-conformant Fortran code they're referring to, we'd be happy to work with them to get it fixed. I'm one of the few people in the Open MPI dev community who has a clue about Fortran,

Re: [OMPI users] Mac OS + openmpi-4.1.2 + intel oneapi

2021-12-30 Thread Jeff Squyres (jsquyres) via users
The conclusion we came to on that issue was that this was an issue with Intel ifort. Was anyone able to raise this with Intel ifort tech support? -- Jeff Squyres jsquy...@cisco.com From: users on behalf of Matt Thompson via users Sent: Thursday,

Re: [OMPI users] stdout scrambled in file

2021-12-07 Thread Jeff Squyres (jsquyres) via users
Open MPI launches a single "helper" process on each node (in Open MPI <= v4.x, that helper process is called "orted"). This process is responsible for launching all the individual MPI processes, and it's also responsible for capturing all the stdout/stderr from those processes and sending it
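A practical consequence of this forwarding design: if interleaved output is a problem, mpirun can tag or redirect each rank's output instead of merging everything onto one stream. A sketch using Open MPI v4.x mpirun options (`./my_mpi_app` is a placeholder for your binary):

```shell
# Prefix every output line with the job/rank that produced it,
# so interleaved lines can at least be attributed and sorted.
mpirun -np 4 --tag-output ./my_mpi_app

# Or write each rank's stdout/stderr to its own file under ./out/
# (one file per rank), avoiding interleaving entirely.
mpirun -np 4 --output-filename ./out ./my_mpi_app
```

Per-rank files are the robust option when output volume is high, since no amount of buffering guarantees line-atomic merging across nodes.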

Re: [OMPI users] stdout scrambled in file

2021-12-05 Thread Jeff Squyres (jsquyres) via users
FWIW: Open MPI 4.1.2 has been released -- you can probably stop using an RC release. I think you're probably running into an issue that is just a fact of life. Especially when there's a lot of output simultaneously from multiple MPI processes (potentially on different nodes), the
