[OMPI users] Open MPI SC'21 SotU BOF slides

2021-11-16 Thread Jeff Squyres (jsquyres) via users
Thanks to everyone who attended today's Open MPI State of the Union BOF session! We started the recording a few minutes late (sorry about that!), but if you're watching the video, know that you only missed an intro slide or two. I *believe* that the SC people told us that the Zoom video

Re: [OMPI users] Open MPI State of the Union BOF at SC'21

2021-11-15 Thread Jeff Squyres (jsquyres) via users
On Nov 15, 2021, at 2:20 PM, Jeff Squyres (jsquyres) via users wrote: SC'21 is hybrid this year; all the academic sessions and BOF are available to online attendees. We hope you'll join us tomorrow, 16 Nov 2021, at 1:15pm US Eastern time ... for the

[OMPI users] Open MPI State of the Union BOF at SC'21

2021-11-15 Thread Jeff Squyres (jsquyres) via users
SC'21 is hybrid this year; all the academic sessions and BOF are available to online attendees. We hope you'll join us tomorrow, 16 Nov 2021, at 1:15pm US Eastern time / 2:15pm US Central time (i.e., local SC'21 time) for the annual Open MPI State of the Union Birds of a Feather (BOF) session.

Re: [OMPI users] Newbie Question.

2021-11-01 Thread Jeff Squyres (jsquyres) via users
Gilles' question is correct; the larger point to make here is that the openib BTL is obsolete and has effectively been replaced by the UCX PML. UCX is supported by the vendor (NVIDIA); openib is not. If you're just starting a new project, I would strongly advocate using UCX instead of openib. On Nov 1,

Re: [OMPI users] OpenMPI 4.1.1, CentOS 7.9, nVidia HPC-SDk, build hints?

2021-09-29 Thread Jeff Squyres (jsquyres) via users
Ray -- Looks like this is a dup of https://github.com/open-mpi/ompi/issues/8919. On Sep 29, 2021, at 10:47 AM, Ray Muno via users wrote: Looking to compile OpenMPI 4.1.1 under Centos 7.9, (with Mellanox OFED 5.3 stack) with the nVidia HPC-SDK, version

Re: [OMPI users] Question about MPI_T

2021-08-19 Thread Jeff Squyres (jsquyres) via users
This appears to be a legit bug with the use of MPI_T in the test/example monitoring app, so I'm going to move the discussion to the Github issue so that we can track it properly: https://github.com/open-mpi/ompi/issues/9260 To answer Jong's question: ob1 is one of Open MPI's point-to-point

Re: [OMPI users] orte-clean errors out on Ubuntu 20.04

2021-07-21 Thread Jeff Squyres (jsquyres) via users
Yeah, it looks like orte-clean is busted in 4.0.x and 4.1.x. I have filed https://github.com/open-mpi/ompi/issues/9171 to track the issue. FWIW, orte-clean shouldn't be necessary for normal Open MPI runs. We should fix it (or remove it), of course, but this shouldn't be a blocker for whatever

Re: [OMPI users] Bitset in OpenMPI

2021-06-18 Thread Jeff Squyres (jsquyres) via users
On Jun 18, 2021, at 10:30 AM, Mehdi Jenab via users wrote: I wonder if there is any way to send and receive a bitset in OpenMPI without converting it to integer. It depends on what type your bitset is. If it's a constant memory layout (e.g., a fixed

Re: [OMPI users] Linker errors in Fedora 34 Docker container

2021-05-25 Thread Jeff Squyres (jsquyres) via users
John -- Can you supply all the information requested here: https://www.open-mpi.org/community/help/ Thanks. On May 25, 2021, at 5:44 PM, John Haiducek via users wrote: Hi, When attempting to build OpenMPI in a Fedora 34 Docker image I get the

Re: [OMPI users] Is there a version of Open MPI which supports IP interfaces that have more than one IP address?

2021-05-17 Thread Jeff Squyres (jsquyres) via users
Sharon -- I replied to your comment on the corresponding GitHub issue: https://github.com/open-mpi/ompi/issues/5818#issuecomment-842531001. On May 14, 2021, at 1:12 PM, Sharon Brunett via users wrote: Does Open MPI v 4.1.0 or v 4.1.1 support using IP

Re: [OMPI users] [Help] Must orted exit after all spawned proecesses exit

2021-05-17 Thread Jeff Squyres (jsquyres) via users
FYI: general Open MPI questions are better sent to the user's mailing list. Up through the v4.1.x series, the "orted" is a general helper process that Open MPI uses on the back-end. It will not quit until all of its children have died. Open MPI's run time is designed with the intent that some

Re: [OMPI users] MCA parameter to disable OFI?

2021-04-20 Thread Jeff Squyres (jsquyres) via users
You can rm $prefix/lib/openmpi/mca_btl_openib* Or set the MCA param "btl" to "^openib" (which means: not openib) -- either in the environment, on the command line, or in the global config file: https://www.open-mpi.org/faq/?category=tuning#setting-mca-params > On Apr 20, 2021, at 2:33 PM,
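
For illustration, a minimal sketch of those three approaches (assuming bash, a hypothetical executable ./my_app, and an install prefix of /opt/openmpi -- adjust to your setup):

    # 1. Environment variable
    export OMPI_MCA_btl='^openib'
    mpirun -np 4 ./my_app

    # 2. Command line
    mpirun --mca btl '^openib' -np 4 ./my_app

    # 3. Global config file ($prefix/etc/openmpi-mca-params.conf)
    echo "btl = ^openib" >> /opt/openmpi/etc/openmpi-mca-params.conf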

Re: [OMPI users] Configure failure for thread support in libevent

2021-04-12 Thread Jeff Squyres (jsquyres) via users
It's not really a problem, per se -- it's that Open MPI needs a libevent with thread support enabled. The libevent that configure found / tried to use did not have thread support enabled. What was the command line you used to invoke configure? > On Apr 11, 2021, at 1:52 PM, John Hearns via

Re: [OMPI users] Building Open-MPI with Intel C

2021-04-07 Thread Jeff Squyres (jsquyres) via users
rnelis Networks. On Wed, 7 Apr 2021 at 14:59, Jeff Squyres (jsquyres) wrote: Check the output from ldd in a non-interactive login: your LD_LIBRARY_PATH probably doesn't include the location of the Intel runtime. E.g. ssh othernode ldd /path/to/orted Your s

Re: [OMPI users] Building Open-MPI with Intel C

2021-04-07 Thread Jeff Squyres (jsquyres) via users
Check the output from ldd in a non-interactive login: your LD_LIBRARY_PATH probably doesn't include the location of the Intel runtime. E.g. ssh othernode ldd /path/to/orted Your shell startup files may well differentiate between interactive and non-interactive logins (i.e., it may set
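
A rough sketch of that check (the hostname and paths below are placeholders, not from this thread):

    # Missing libraries usually show up only in the non-interactive case
    ldd /opt/openmpi/bin/orted | grep "not found"
    ssh othernode ldd /opt/openmpi/bin/orted | grep "not found"

    # If the ssh form reports missing Intel runtime libraries, export the
    # runtime location somewhere non-interactive shells also read it
    # (e.g., near the top of ~/.bashrc, before any "interactive only" guard):
    export LD_LIBRARY_PATH=/opt/intel/lib/intel64:$LD_LIBRARY_PATH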

[OMPI users] Open MPI State of the Union BOF (webinar)

2021-03-30 Thread Jeff Squyres (jsquyres) via users
Thanks to all who attended today. The slides from the presentation are now available here: https://www-lb.open-mpi.org/papers/ecp-bof-2021/ > On Mar 29, 2021, at 2:52 PM, Jeff Squyres (jsquyres) via announce > wrote: > > Gentle reminder that the Open MPI State of the Union

Re: [OMPI users] Open MPI State of the Union BOF (webinar)

2021-03-29 Thread Jeff Squyres (jsquyres) via users
of time! Hope to see you there! > On Mar 15, 2021, at 1:06 PM, Jeff Squyres (jsquyres) > wrote: > > In conjunction with the Exascale Computing Project (ECP), George Bosilca, > Jeff Squyres, and members of the Open MPI community will present the current > status and future ro

[OMPI users] Open MPI State of the Union BOF (webinar)

2021-03-15 Thread Jeff Squyres (jsquyres) via users
In conjunction with the Exascale Computing Project (ECP), George Bosilca, Jeff Squyres, and members of the Open MPI community will present the current status and future roadmap for the Open MPI project. We typically have an Open MPI "State of the Union" BOF at the annual Supercomputing

Re: [OMPI users] Stable and performant openMPI version for Ubuntu20.04 ?

2021-03-12 Thread Jeff Squyres (jsquyres) via users
On Mar 12, 2021, at 1:20 AM, Raut, S Biplab via users wrote: Reposting here without the logs – it seems there is a message size limit of 150KB and so could not attach the logs. (Request the moderator to approve the original mail that has attachment of

Re: [OMPI users] config: gfortran: "could not run a simple Fortran program"

2021-03-08 Thread Jeff Squyres (jsquyres) via users
What is the exact configure line you used to build Open MPI? You don't want to put CC and CXX in a single quoted token. For example, do this: ./configure CC=gcc CXX=g++ ... Don't do this (which is what your previous mail implied you might be doing...?): ./configure "CC=gcc CXX=g++" ... On

Re: [OMPI users] MPI executable fails on ArchLinux on Termux

2021-02-25 Thread Jeff Squyres (jsquyres) via users
want to know why OpenMPI is not functioning as expected. On Thu, 25 Feb 2021, 20:17 Jeff Squyres (jsquyres) wrote: I don't know how many people on this list will be familiar with Termux or Arch Linux. From a quick Google, it looks like Termux is an Android e

Re: [OMPI users] MPI executable fails on ArchLinux on Termux

2021-02-25 Thread Jeff Squyres (jsquyres) via users
I don't know how many people on this list will be familiar with Termux or Arch Linux. From a quick Google, it looks like Termux is an Android emulator (that runs on Android? That doesn't make sense to me, but I'm wholly unfamiliar with that ecosystem, so I don't have the background / grok the

Re: [OMPI users] Status of v4.1.1

2021-02-22 Thread Jeff Squyres (jsquyres) via users
4.1.1 is due "very soon" -- there are two outstanding blocker UCX issues that are awaiting resolution. That being said, you can certainly start with v4.1.0 today and upgrade to v4.1.1 when it comes out. 4.1.1 will be ABI and backwards compatible with v4.1.0. On Feb 21, 2021, at 11:40 AM,

Re: [OMPI users] weird mpi error report: Type mismatch between arguments

2021-02-17 Thread Jeff Squyres (jsquyres) via users
On Feb 17, 2021, at 12:24 PM, Luis Diego Pinto via users wrote: > > Dear Christoph Niethammer, > > thank you very much for the advice, now I have imported the module "mpi_f08" > and indeed those issues are fixed. Exxxcellent. mpi_f08 is definitely the way to go for all new Fortran MPI

Re: [OMPI users] mpi_f08_types_issue

2021-02-16 Thread Jeff Squyres (jsquyres) via users
--Original Message- >> From: users On Behalf Of Jeff Squyres >> (jsquyres) via users >> Sent: Tuesday, February 16, 2021 10:55 AM >> To: Open MPI User's List >> Cc: Jeff Squyres (jsquyres) >> Subject: [EXTERNAL] Re: [OMPI users] mpi_f08_types_issue >> >

Re: [OMPI users] mpi_f08_types_issue

2021-02-16 Thread Jeff Squyres (jsquyres) via users
mpi_f08_types.mod -- to fail, thereby causing the overall Open MPI compilation to fail. Weird. > On Feb 12, 2021, at 10:37 AM, Jeff Squyres (jsquyres) via users > wrote: > > Can you please send all the information listed here: > > https://www.open-mpi.org/community/help/ > &

Re: [OMPI users] mpi_f08_types_issue

2021-02-12 Thread Jeff Squyres (jsquyres) via users
Can you please send all the information listed here: https://www.open-mpi.org/community/help/ Thanks! On Feb 12, 2021, at 7:39 AM, Yonas Mersha via users wrote: Hi everyone, I am trying to install openmpi-4.1.0 with the FORTRAN support for my project

Re: [OMPI users] OpenMPI 4.1.0 misidentifies x86 capabilities

2021-02-11 Thread Jeff Squyres (jsquyres) via users
On Feb 11, 2021, at 11:55 AM, Max R. Dechantsreiter wrote: > > Not sure about the latest, but I built v4.1.x-202102090356-380ac96 > without errors, then used that to successfully build and test > GROMACS parallel mdrun. Great! > While I do not like using non-release versions, this is

Re: [OMPI users] GROMACS with openmpi

2021-02-11 Thread Jeff Squyres (jsquyres) via users
You might want to ask your question to the Gromacs community -- they can probably be more helpful than we can (we know little / nothing about Gromacs). Good luck! On Feb 11, 2021, at 10:46 AM, Wenhao Yao via users wrote: Thanks for helping me though it a

Re: [OMPI users] OpenMPI 4.1.0 misidentifies x86 capabilities

2021-02-11 Thread Jeff Squyres (jsquyres) via users
ht.us >> # Hello from thread 1 out of 2 from process 0 out of 2 on >> server.clearlight.us >> # Hello from thread 0 out of 2 from process 1 out of 2 on >> server.clearlight.us >> # Hello from thread 1 out of 2 from process 1 out of 2 on >> server.clearlight.us

Re: [OMPI users] OpenMPI 4.1.0 misidentifies x86 capabilities

2021-02-10 Thread Jeff Squyres (jsquyres) via users
I think Max did try the latest 4.1 nightly build (from an off-list email), and his problem still persisted. Max: can you describe exactly how Open MPI failed? All you said was: >> Consequently AVX512 intrinsic functions were erroneously >> deployed, resulting in OpenMPI failure. Can you

Re: [OMPI users] 4.1.0 build failure with --enable-cxx-exceptions

2021-02-05 Thread Jeff Squyres (jsquyres) via users
Carl -- I'm not able to reproduce your error. What does config.log say? On Feb 5, 2021, at 2:36 PM, Carl Ponder via users wrote: I usually build OpenMPI using these flags --enable-mpi-cxx --enable-mpi-cxx-seek --enable-cxx-exceptions but get this error for

Re: [OMPI users] AVX errors building OpenMPI 4.1.0

2021-02-05 Thread Jeff Squyres (jsquyres) via users
Actually, can you try with the latest v4.1.x nightly snapshot tarball? We have fixed several AVX-related issues since v4.1.0 was released: https://www.open-mpi.org/nightly/v4.1.x/ On Feb 5, 2021, at 2:54 PM, George Bosilca via users wrote: Carl, AVX

Re: [OMPI users] OMPI 4.1 in Cygwin packages?

2021-02-04 Thread Jeff Squyres (jsquyres) via users
Do we know if this was definitely fixed in v4.1.x? > On Feb 4, 2021, at 7:46 AM, Gilles Gouaillardet via users > wrote: > > Martin, > > this is a connectivity issue reported by the btl/tcp component. > > You can try restricting the IP interface to a subnet known to work > (and with no

Re: [OMPI users] OpenMPI version compatibility with libfabric, UCX, etc...

2021-02-01 Thread Jeff Squyres (jsquyres) via users
On Jan 26, 2021, at 3:03 PM, Craig via users wrote: Is there a table somewhere that tells me what version of things like libfabric and UCX (and maybe compiler versions if there are known issues) are known to be good with which versions of OpenMPI? I poked

Re: [OMPI users] Issues with compilers

2021-02-01 Thread Jeff Squyres (jsquyres) via users
the configure execution, but when the system tries to compile orte-info it crashes. I enclose the outputs again. configure.out contains both the outputs from configure and compilation. Kind regards, Álvaro On Fri, 22 Jan 2021 at 16:09, Jeff Squyres (jsquyres) wrote:

Re: [OMPI users] High errorcode message

2021-01-31 Thread Jeff Squyres (jsquyres) via users
Is your app calling MPI_Abort directly? There's a 2nd argument to MPI_ABORT that should be reported in the output message. If it's not, we should investigate that. Or is your app aborting via some other, indirect method? If so, perhaps somehow that 2nd argument is getting dropped somewhere

Re: [OMPI users] High errorcode message

2021-01-29 Thread Jeff Squyres (jsquyres) via users
It's somewhat hard to say without more information. What is your app doing when it calls abort? On Jan 29, 2021, at 8:49 PM, Arturo Fernandez via users wrote: Hello, My system is running CentOS8 & OpenMPI v4.1.0. Most stuff is working fine but one app is

Re: [OMPI users] Issues with compilers

2021-01-22 Thread Jeff Squyres (jsquyres) via users
On Jan 22, 2021, at 9:49 AM, Alvaro Payero Pinto via users wrote: > > I am trying to install Open MPI with Intel compiler suite for the Fortran > side and GNU compiler suite for the C side. For factors that don’t depend > upon me, I’m not allowed to change the C compiler suite to Intel one

Re: [OMPI users] bad defaults with ucx

2021-01-14 Thread Jeff Squyres (jsquyres) via users
Good question. I've filed https://github.com/open-mpi/ompi/issues/8379 so that we can track this. > On Jan 14, 2021, at 7:53 AM, Dave Love via users > wrote: > > Why does 4.1 still not use the right defaults with UCX? > > Without specifying osc=ucx, IMB-RMA crashes like 4.0.5. I haven't >

Re: [OMPI users] MPI_type_free question

2020-12-14 Thread Jeff Squyres (jsquyres) via users
Yes, opening an issue would be great -- thanks! On Dec 14, 2020, at 11:32 AM, Patrick Bégou via users wrote: OK, Thanks Gilles. Does it still require that I open an issue for tracking? Patrick On 14/12/2020 at 14:56, Gilles Gouaillardet via users wrote:

Re: [OMPI users] help

2020-12-14 Thread Jeff Squyres (jsquyres) via users
On Dec 12, 2020, at 4:58 AM, Lesiano 16 via users wrote: > > My question is, can I assume that when skipping the beginning of the file > that MPI will fill up with zeros? Or is it implementation dependent? > > I have read the standard, but I could not find anything meaningful expected >

Re: [OMPI users] calling mpiexec with std::system in WSL2

2020-11-30 Thread Jeff Squyres (jsquyres) via users
I have not personally tried Open MPI in WSL / WSL2. Two suggestions: 1. Can you post the output of the error that occurs? 2. Can you try upgrading to the latest 4.0.x release, or at least the latest v3.1.x release (as of this writing v3.1.6)? > On Nov 27, 2020, at 10:50 AM, schoon via users

Re: [OMPI users] 4.0.5 on Linux Pop!_OS

2020-11-09 Thread Jeff Squyres (jsquyres) via users
Paul -- The help message describes 4 ways to set the number of available slots on your machine: 1. Hostfile, via "slots=N" clauses (N defaults to number of processor cores if not provided) 2. The --host command line parameter, via a ":N" suffix on the hostname (N defaults to 1 if not
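
A minimal sketch of the first two approaches (hostnames and the executable name are placeholders):

    # 1. Hostfile with explicit slot counts (N defaults to the core count if omitted)
    printf 'node01 slots=16\nnode02 slots=16\n' > myhosts
    mpirun --hostfile myhosts -np 32 ./my_app

    # 2. --host with a ":N" slot suffix (N defaults to 1 if omitted)
    mpirun --host node01:16,node02:16 -np 32 ./my_app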

[OMPI users] Continued Open MPI mailing lists instability

2020-11-05 Thread Jeff Squyres (jsquyres) via users
Over the past month or so, you may have noticed that sometimes mails you sent to the Open MPI mailing lists were delayed, sometimes by multiple days. Our mailing list provider has been experiencing technical difficulties in keeping the Open MPI mailing lists flowing. They tell us that they had

Re: [OMPI users] mpirun on Kubuntu 20.4.1 hangs

2020-10-21 Thread Jeff Squyres (jsquyres) via users
There are huge differences between Open MPI v2.1.1 and v4.0.3 (i.e., years of development effort); it would be very hard to categorize them all; sorry! What happens if you mpirun -np 1 touch /tmp/foo (Yes, you can run non-MPI apps through mpirun) Is /tmp/foo created? (i.e., did the job

[OMPI users] Delays in Open MPI mailing list

2020-10-21 Thread Jeff Squyres (jsquyres) via users
FYI: We've been having some problems with our mailing list provider over the past few weeks. No mails have been lost, but sometimes mails queue up endlessly at our mailing list provider until a human IT staffer goes in, fixes a problem, and effectively releases all the mails that have queued

Re: [OMPI users] Code failing when requesting all "processors"

2020-10-20 Thread Jeff Squyres (jsquyres) via users
On Oct 15, 2020, at 3:27 AM, Diego Zuccato wrote: > >>> The version is 3.1.3 , as packaged in Debian Buster. >> The 3.1.x series is pretty old. If you want to stay in the 3.1.x >> series, you might try upgrading to the latest -- 3.1.6. That has a >> bunch of bug fixes compared to v3.1.3. > I'm

Re: [OMPI users] Code failing when requesting all "processors"

2020-10-19 Thread Jeff Squyres (jsquyres) via users
On Oct 14, 2020, at 3:07 AM, Diego Zuccato wrote: On 13/10/20 16:33, Jeff Squyres (jsquyres) wrote: That's odd. What version of Open MPI are you using? The version is 3.1.3, as packaged in Debian Buster. The 3.1.x series is pretty old. If yo

Re: [OMPI users] Code failing when requesting all "processors"

2020-10-13 Thread Jeff Squyres (jsquyres) via users
On Oct 13, 2020, at 10:43 AM, Gus Correa via users wrote: > > Can you use taskid after MPI_Finalize? Yes. It's a variable, just like any other. > Isn't it undefined/deallocated at that point? No. MPI filled it in during MPI_Comm_rank() and then never touched it again. So even though MPI

Re: [OMPI users] Code failing when requesting all "processors"

2020-10-13 Thread Jeff Squyres (jsquyres) via users
That's odd. What version of Open MPI are you using? > On Oct 13, 2020, at 6:34 AM, Diego Zuccato via users > wrote: > > Hello all. > > I have a problem on a server: launching a job with mpirun fails if I > request all 32 CPUs (threads, since HT is enabled) but succeeds if I > only request

Re: [OMPI users] Limiting IP addresses used by OpenMPI

2020-09-01 Thread Jeff Squyres (jsquyres) via users
3.1.2 was a long time ago, but I'm pretty sure that Open MPI v3.1.2 has btl_tcp_if_include / btl_tcp_if_exclude. Try running: "ompi_info --all --parsable | grep btl_tcp_if_" I believe that those options will both take a CIDR notation of which network(s) to use/not use. Note: the _if_include
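
A short sketch of checking for and using those parameters (the subnet and interface names below are examples only):

    # Confirm the parameters exist in your build
    ompi_info --all --parsable | grep btl_tcp_if_

    # Restrict MPI TCP traffic to one subnet, in CIDR notation...
    mpirun --mca btl_tcp_if_include 192.168.1.0/24 -np 4 ./my_app

    # ...or exclude specific interfaces instead (use one option or the other, not both)
    mpirun --mca btl_tcp_if_exclude lo,docker0 -np 4 ./my_app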

Re: [OMPI users] Problem in starting openmpi job - no output just hangs

2020-08-25 Thread Jeff Squyres (jsquyres) via users
On Aug 24, 2020, at 9:44 PM, Tony Ladd wrote: > > I appreciate your help (and John's as well). At this point I don't think it is > an OMPI problem - my mistake. I think the communication with RDMA is somehow > disabled (perhaps it's the verbs layer - I am not very knowledgeable with > this). It

Re: [OMPI users] Problem in starting openmpi job - no output just hangs

2020-08-24 Thread Jeff Squyres (jsquyres) via users
fail. >> >>The only good (as best I can tell) diagnostic is from openMPI. >>ibv_obj >>(from v2.x) complains that openib returns a NULL object, whereas >>on my >>server it returns logical_index=1. Can we not try to diagnose the >>proble

Re: [OMPI users] Problem in starting openmpi job - no output just hangs

2020-08-19 Thread Jeff Squyres (jsquyres) via users
Tony -- Have you tried compiling Open MPI with UCX support? This is Mellanox (NVIDIA's) preferred mechanism for InfiniBand support these days -- the openib BTL is legacy. You can run: mpirun --mca pml ucx ... > On Aug 19, 2020, at 12:46 PM, Tony Ladd via users > wrote: > > One other
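
A hedged sketch of that end to end (the UCX and install paths are placeholders):

    # Build Open MPI against UCX
    ./configure --with-ucx=/opt/ucx --prefix=/opt/openmpi
    make -j 8 install

    # Run with the UCX PML explicitly selected
    /opt/openmpi/bin/mpirun --mca pml ucx -np 4 ./my_app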

Re: [OMPI users] MPI is still dominantparadigm?

2020-08-08 Thread Jeff Squyres (jsquyres) via users
On Aug 7, 2020, at 12:52 PM, Oddo Da via users wrote: The Java bindings support "recent" JDK, and if you face an issue, please report a bug (either here or on github) Starting with Java 8, the language went into a much different direction - functional

Re: [OMPI users] Books/resources to learn (open)MPI from

2020-08-06 Thread Jeff Squyres (jsquyres) via users
FWIW, we didn't talk too much about the internals of Open MPI -- but it's a good place to start (i.e., you won't understand the internals until you understand the externals). You can find all the videos and slides for all 3 parts here: https://www.open-mpi.org/video/?category=general In

Re: [OMPI users] choosing network: infiniband vs. ethernet

2020-07-22 Thread Jeff Squyres (jsquyres) via users
checking whether ucp_tag_send_sync_nbx is declared... no checking whether ucp_tag_recv_nbx is declared... no checking for ucp_request_param_t... no configure: error: UCX support requested but not found. Aborting .. Lana (lana.de...@gmail.com) On Mon, Jul

Re: [OMPI users] choosing network: infiniband vs. ethernet

2020-07-20 Thread Jeff Squyres (jsquyres) via users
network transports automatically based on what's available. I'll also look at the slides and see if I can make sense of them. Thanks. .. Lana (lana.de...@gmail.com) On Sat, Jul 18, 2020 at 9:41 AM Jeff Squyres (jsquyres) wrote:

Re: [OMPI users] choosing network: infiniband vs. ethernet

2020-07-18 Thread Jeff Squyres (jsquyres) via users
On Jul 16, 2020, at 2:56 PM, Lana Deere via users wrote: I am new to open mpi. I built 4.0.4 on a CentOS7 machine and tried doing an mpirun of a small program compiled against openmpi. It seems to have failed because my host does not have infiniband. I

Re: [OMPI users] choosing network: infiniband vs. ethernet

2020-07-18 Thread Jeff Squyres (jsquyres) via users
Woo hoo! I love getting emails like this. We actually spend quite a bit of time in the design and implementation of the configure/build system so that it will "just work" in a wide variety of situations. Thanks! On Jul 17, 2020, at 5:43 PM, John Duffy via users

Re: [OMPI users] MTU Size and Open-MPI/HPL Benchmark

2020-07-15 Thread Jeff Squyres (jsquyres) via users
In addition to what Gilles said, if you're using TCP for your MPI transport, changing the MTU probably won't have a huge impact on HPL. Open MPI will automatically react to the MTU size; there shouldn't be anything you need to change. Indeed, when using TCP, the kernel TCP stack is the one

[OMPI users] The ABCs of Open MPI (parts 1 and 2): slides + videos posted

2020-07-14 Thread Jeff Squyres (jsquyres) via users
The slides and videos for parts 1 and 2 of the online seminar presentation "The ABCs of Open MPI" have been posted on both the Open MPI web site and the EasyBuild wiki: https://www.open-mpi.org/video/?category=general

Re: [OMPI users] Signal code: Non-existant physical address (2)

2020-07-06 Thread Jeff Squyres (jsquyres) via users
Greetings Prentice. This is a very generic error, it's basically just indicating "somewhere in the program, we got a bad pointer address." It's very difficult to know if this issue is in Open MPI or in the application itself (e.g., memory corruption by the application eventually lead to bad

Re: [OMPI users] [Open MPI Announce] Online presentation: the ABCs of Open MPI

2020-07-06 Thread Jeff Squyres (jsquyres) via users
ild-Tech-Talks-I:-Open-MPI Anyone is free to join either / both parts. Hope to see you this Wednesday! On Jun 14, 2020, at 2:05 PM, Jeff Squyres (jsquyres) via announce wrote: In conjunction with the EasyBuild community, Ralph Castain (Intel, Open

Re: [OMPI users] Build error On macOS

2020-06-28 Thread Jeff Squyres (jsquyres) via users
is Attached files are stdout and also config.log Thank you! Sahir -- On 27. Jun 2020, at 17:48, Jeff Squyres (jsquyres) wrote: On Jun 26, 2020, at 11:32 AM, Sahir Butt via users wrote: I am trying to build openmpi-4.

Re: [OMPI users] Unable to run MPI application

2020-06-27 Thread Jeff Squyres (jsquyres) via users
On Jun 26, 2020, at 7:30 AM, Peter Kjellström via users wrote: > >> The cluster hardware is QLogic infiniband with Intel CPUs. My >> understanding is that we should be using the old PSM for networking. >> >> Any thoughts what might be going wrong with the build? > > Yes only PSM will

Re: [OMPI users] Build error On macOS

2020-06-27 Thread Jeff Squyres (jsquyres) via users
On Jun 26, 2020, at 11:32 AM, Sahir Butt via users wrote: > > I am trying to build openmpi-4.0.3 with gcc 11 on macOS. I ran following to > configure: > > ../configure --prefix=/path-to/opt/openmpi-4.0.3 > --with-wrapper-ldflags="-Wl,-search_paths_first" > > I keep getting following error:

Re: [OMPI users] Question about virtual interface

2020-06-23 Thread Jeff Squyres (jsquyres) via users
akes it very difficult to > specify IP address for btl_tcp_if_include. > > For the named exclude interfaces, it still hanged (with no output) when I > specified btl_base_verbose 100. > > I will try using the CIDR for the below hosts as an experiment. > > Regards, > Vipul >

Re: [OMPI users] Question about virtual interface

2020-06-23 Thread Jeff Squyres (jsquyres) via users
https://www.open-mpi.org/faq/?category=tcp#ip-virtual-ip-interfaces is referring to interfaces like "eth0:0", where the Linux kernel will have the same index for both "eth0" and "eth0:0". This will cause Open MPI to get confused (because it identifies Ethernet interfaces by their kernel

Re: [OMPI users] [Open MPI Announce] Online presentation: the ABCs of Open MPI

2020-06-22 Thread Jeff Squyres (jsquyres) via users
! > On Jun 14, 2020, at 2:05 PM, Jeff Squyres (jsquyres) via announce > wrote: > > In conjunction with the EasyBuild community, Ralph Castain (Intel, Open MPI, > PMIx) and Jeff Squyres (Cisco, Open MPI) will host an online presentation > about Open MPI on **Wednesday June 24th 2020

Re: [OMPI users] OpenMPI compiles, but openib BTL hangs on trivial jobs

2020-06-20 Thread Jeff Squyres (jsquyres) via users
On Jun 19, 2020, at 6:59 PM, Thomas M. Payerle via users wrote: > > We are upgrading a cluster from RHEL6 to RHEL8, and have migrated some nodes > to a new partition and reimaged with RHEL8. I am having some issues getting > openmpi to work with infiniband on the nodes upgraded to RHEL8.

Re: [OMPI users] Command to check which interface OpenMPI is using on a multi-NIC server?

2020-06-16 Thread Jeff Squyres (jsquyres) via users
On Jun 15, 2020, at 3:43 PM, Roberto Herraro via users wrote: > > We have a small cluster and are running paired HPL to test performance, but > are getting poor results. One of our suspicions is that the regular 1GbE > interface might be being used, rather than the 100G interface. Is there a

Re: [OMPI users] The use of __STDC_VERSION__ in mpi.h and C++ (g++ -Wundef)

2020-06-14 Thread Jeff Squyres (jsquyres) via users
Martin -- Someone filed the same issue shortly before your post: https://github.com/open-mpi/ompi/issues/7810 On Jun 12, 2020, at 6:10 PM, Martin Audet via users wrote: Hello OMPI_Developers, When I compile my C++ code with Open MPI version 4.0.3 or

[OMPI users] Online presentation: the ABCs of Open MPI

2020-06-14 Thread Jeff Squyres (jsquyres) via users
In conjunction with the EasyBuild community, Ralph Castain (Intel, Open MPI, PMIx) and Jeff Squyres (Cisco, Open MPI) will host an online presentation about Open MPI on **Wednesday June 24th 2020** at: - 11am US Eastern time - 8am US Pacific time - 3pm UTC - 5pm CEST The general scope of the

Re: [OMPI users] MPI I/O question using MPI_File_write_shared

2020-06-05 Thread Jeff Squyres (jsquyres) via users
You cited Open MPI v2.1.1. That's a pretty ancient version of Open MPI. Any chance you can upgrade to Open MPI 4.0.x? > On Jun 5, 2020, at 7:24 PM, Stephen Siegel wrote: > > > >> On Jun 5, 2020, at 6:55 PM, Jeff Squyres (jsquyres) >> wrote: >> >>

Re: [OMPI users] MPI I/O question using MPI_File_write_shared

2020-06-05 Thread Jeff Squyres (jsquyres) via users
On Jun 5, 2020, at 6:35 PM, Stephen Siegel via users wrote: > > [ilyich:12946] 3 more processes have sent help message help-mpi-btl-base.txt > / btl:no-nics > [ilyich:12946] Set MCA parameter "orte_base_help_aggregate" to 0 to see all > help / error messages It looks like your output somehow

Re: [OMPI users] Problem with open-mpi installation

2020-06-05 Thread Jeff Squyres (jsquyres) via users
Are you actually running into a problem? A successful install may still end with "Nothing to be done..." messages. On Jun 5, 2020, at 10:48 AM, Edris Tajfirouzeh via users wrote: Dear Operator I'm trying to install open-mpi package on my mac catalina version

Re: [OMPI users] Running mpirun with grid

2020-06-01 Thread Jeff Squyres (jsquyres) via users
On top of what Ralph said, I think that this output is unexpected: > Starting server daemon at host "cod5"Starting server daemon at host > "cod6"Starting server daemon at host "has4"Starting server daemon at host "co > d4" > > > > Starting server daemon at host "hpb12"Starting server daemon

Re: [OMPI users] Error with MPI_GET_ADDRESS and MPI_TYPE_CREATE_RESIZED?

2020-05-18 Thread Jeff Squyres (jsquyres) via users
Was your Open MPI compiled with -r8? We definitely recommend using the same compiler flags to compile Open MPI as your application. As George noted, -r8 can definitely cause issues if one was compiled with it and the other was not. On May 18, 2020, at 12:24 AM, George Bosilca via users
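
A sketch of what "same flags" means in practice (-r8 is from this thread; the compiler choice and paths are assumptions, since -r8 is an Intel-style flag):

    # Build Open MPI with the same Fortran flag the application uses
    ./configure FC=ifort FCFLAGS="-r8" --prefix=/opt/openmpi-r8
    make -j 8 install

    # Compile the application with the matching wrapper and the same flag
    /opt/openmpi-r8/bin/mpifort -r8 -o my_app my_app.f90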

Re: [OMPI users] I can't build openmpi 4.0.X using PMIx 3.1.5 to use with Slurm

2020-05-12 Thread Jeff Squyres (jsquyres) via users
On May 12, 2020, at 7:42 AM, Leandro wrote: > > I compile it statically to make sure compilers libraries will not be a > dependency, and I do this way for years. For what it's worth, you're compiling with -static-intel, which should take care of removing the compiler's libraries as

Re: [OMPI users] I can't build openmpi 4.0.X using PMIx 3.1.5 to use with Slurm

2020-05-12 Thread Jeff Squyres (jsquyres) via users
It looks like you are building both static and dynamic libraries (--enable-static and --enable-shared). This might be confusing the issue -- I can see at least one warning: icc: warning #10237: -lcilkrts linked in dynamically, static library not available It's not easy to tell from the

Re: [OMPI users] opal_path_nfs freeze

2020-04-23 Thread Jeff Squyres (jsquyres) via users
On Apr 23, 2020, at 8:50 AM, Patrick Bégou wrote: > > As we say in French, "dans le mille!" you were right. > I'm not the admin of these servers and a "mpirun not found" was sufficient in > my mind. It wasn't. > > As I had deployed OpenMPI 4.0.2 I launch a new build after setting my >

Re: [OMPI users] opal_path_nfs freeze

2020-04-22 Thread Jeff Squyres (jsquyres) via users
The test should only take a few moments; no need to let it sit for a full hour. I have seen this kind of behavior before if you have an Open MPI installation in your PATH / LD_LIBRARY_PATH already, and then you invoke "make check". Because the libraries may be the same name and/or .so version
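
A minimal sketch of that environment check before running the tests (the paths and version are examples):

    # Make sure no previously installed Open MPI will be picked up
    which mpirun              # ideally prints nothing, or only the build under test
    echo "$LD_LIBRARY_PATH"   # should not contain an older Open MPI lib directory

    # If either does, clean the environment (or start a fresh shell), then:
    cd openmpi-4.0.2
    make check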

Re: [OMPI users] Hwlock library problem

2020-04-15 Thread Jeff Squyres (jsquyres) via users
Can you send all the information listed here: https://www.open-mpi.org/community/help/ On Apr 15, 2020, at 6:13 AM, フォンスポール J via users wrote: Dear Gilles, Thank you for your advice. I have tried a few of the suggestions that I encountered following

Re: [OMPI users] Regarding eager limit relationship to send message size

2020-03-26 Thread Jeff Squyres (jsquyres) via users
On Mar 26, 2020, at 5:36 AM, Raut, S Biplab wrote: > > I am doing pairwise send-recv and not all-to-all since not all the data is > required by all the ranks. > And I am doing blocking send and recv calls since there are multiple > iterations of such message chunks to be sent with

Re: [OMPI users] Regarding eager limit relationship to send message size

2020-03-25 Thread Jeff Squyres (jsquyres) via users
On Mar 25, 2020, at 4:49 AM, Raut, S Biplab via users wrote: > > Let’s say the application is running with 128 ranks. > Each rank is doing send() msg to rest of 127 ranks where the msg length sent > is under question. > Now after all the sends are completed, each rank will recv() msg from

Re: [OMPI users] Fault in not recycling bsend buffer ?

2020-03-18 Thread Jeff Squyres (jsquyres) via users
Let's back up and ask a question: is there a reason you're using Bsend? I.e., do you need to use Bsend for some reason, or can you use regular (potentially non-buffering) sends instead? On Mar 18, 2020, at 5:16 AM, Martyn Foster via users wrote: Hi George,

Re: [OMPI users] Question about run time message

2020-03-13 Thread Jeff Squyres (jsquyres) via users
> On Mar 13, 2020, at 9:33 AM, Jeffrey Layton via users > wrote: > > Good morning, > > I've compiled a hello world MPI code and when I run it, I get some messages > I'm not familiar with. The first one is, > > -- >

Re: [OMPI users] How to use OPENMPI with different Service Level in Infiniband Virtual Lane?

2020-02-27 Thread Jeff Squyres (jsquyres) via users
If you're using Open MPI v4.0.x, you should likely be using the UCX PML plugin for InfiniBand communication. As I understand it, UCX is controlled by environment variables. You'll likely need to look through the UCX documentation to see which environment variable(s) are needed for setting
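
A heavily hedged sketch (the UCX_IB_SL variable name is an assumption here, not quoted from this thread -- verify it against your UCX version's own configuration listing):

    # Ask UCX which tunables it understands, and look for the service-level one
    ucx_info -c -f | grep -i "_SL"

    # Export it to all ranks through mpirun (the value 3 is just an example)
    mpirun -x UCX_IB_SL=3 --mca pml ucx -np 4 ./my_app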

Re: [OMPI users] OpenFabrics

2020-02-03 Thread Jeff Squyres (jsquyres) via users
> On Feb 3, 2020, at 12:35 PM, Bennet Fauber wrote: > > This is what CentOS installed. > > $ yum list installed hwloc\* > Loaded plugins: langpacks > Installed Packages > hwloc.x86_64 1.11.8-4.el7 > @os > hwloc-devel.x86_64

Re: [OMPI users] OpenFabrics

2020-02-03 Thread Jeff Squyres (jsquyres) via users
On Feb 3, 2020, at 10:03 AM, Bennet Fauber wrote: > > Ah, ha! > > Yes, that seems to be it. Thanks. Ok, good. I understand that UCX is the "preferred" mechanism for IB these days. > If I might, on a configure related note ask, whether, if we have > these installed with the CentOS 7.6 we

Re: [OMPI users] OpenFabrics

2020-02-02 Thread Jeff Squyres (jsquyres) via users
Bennet -- Just curious: is there a reason you're not using UCX? > On Feb 2, 2020, at 4:06 PM, Bennet Fauber via users > wrote: > > We get these warnings/error from OpenMPI, version 3.1.4 and 4.0.2 > > -- > WARNING: No

Re: [OMPI users] [External] Re: OMPI returns error 63 on AMD 7742 when utilizing 100+ processors per node

2020-01-27 Thread Jeff Squyres (jsquyres) via users
Can you please send all the information listed here: https://www.open-mpi.org/community/help/ Thanks! On Jan 27, 2020, at 12:00 PM, Collin Strassburger via users wrote: Hello, I had initially thought the same thing about the streams, but I have 2

Re: [OMPI users] mpicc fails to compile example code when --enable-static --disable-shared is used for installation.

2020-01-22 Thread Jeff Squyres (jsquyres) via users
Greetings Mehmet. I'm curious: why do you want to use static linking? It tends to cause complications and issues like this. Specifically: Open MPI's `--disable-shared --enable-static` switches means that Open MPI will produce static libraries instead of shared libraries (e.g., libmpi.a
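
A rough sketch of the distinction being drawn (the prefix and file names are placeholders):

    # These switches make Open MPI itself build static libraries
    # (libmpi.a and friends instead of libmpi.so):
    ./configure --prefix=/opt/openmpi-static --enable-static --disable-shared
    make -j 8 install

    # Linking the *application* statically is a separate decision:
    /opt/openmpi-static/bin/mpicc -o my_app my_app.c           # static libmpi.a, dynamic system libs
    /opt/openmpi-static/bin/mpicc -static -o my_app my_app.c   # fully static; often brings extra complications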

Re: [OMPI users] need a tool and its use to verify use of infiniband network

2020-01-16 Thread Jeff Squyres (jsquyres) via users
In addition to that, let me pile on to what Michael Heinz said (he suggested you use "... --mca btl_base_verbose 100 ..." in your mpirun command line). Short version: Try "mpirun --mca pml_base_verbose 100 --mca btl_base_verbose 100 ..." I.e., add in the bit about pml_base_verbose. If you see

Re: [OMPI users] problems porting PVM to MPI

2020-01-07 Thread Jeff Squyres (jsquyres) via users
On Jan 7, 2020, at 11:48 AM, Pablo Goloboff via users wrote: Hi, I am trying to port an existing PVM-based application (developed over many years) onto MPI, trying to preserve as much of the organization as possible. This is my first contact with MPI, after

Re: [OMPI users] building openmpi-4.0.2 with gfortran

2019-12-13 Thread Jeff Squyres (jsquyres) via users
> On Dec 13, 2019, at 10:08 AM, Tom Rosmond via users > wrote: > > I recently upgraded to Fedora 31 and tried to build openmpi-4.0.2 configured > with > > ./configure FC=gfortran F90=gfortran --prefix=/opt/openmpi --with-slurm > > The configure seemed fine, but the 'make' failed with > >

Re: [OMPI users] mca_oob_tcp_recv_handler: invalid message type: 15

2019-12-06 Thread Jeff Squyres (jsquyres) via users
On Dec 6, 2019, at 1:03 PM, Jeff Squyres (jsquyres) via users wrote: > >> I get the same error when running in a single node. I will try to use the >> last version. Is there way to check if different versions of open mpi were >> used in different nodes? > > mp

Re: [OMPI users] mca_oob_tcp_recv_handler: invalid message type: 15

2019-12-06 Thread Jeff Squyres (jsquyres) via users
On Dec 6, 2019, at 12:40 PM, Guido granda muñoz wrote: > > I get the same error when running in a single node. I will try to use the > last version. Is there way to check if different versions of open mpi were > used in different nodes? mpirun -np 2 ompi_info | head Or something like that.
