Re: [OMPI users] mca_oob_tcp_recv_handler: invalid message type: 15

2019-12-05 Thread Jeff Squyres (jsquyres) via users
How did you try to execute your application? An error message like this can mean that you accidentally mixed versions of Open MPI within your run (e.g., used Open MPI va.b.c on node A but used Open MPI vx.y.z on node B). > On Dec 5, 2019, at 5:28 PM, Guido granda muñoz via users > wrote: >

Re: [OMPI users] speed of model is slow with openmpi

2019-12-02 Thread Jeff Squyres (jsquyres) via users
There may also be confusion here between OpenMP and Open MPI -- these are two very different technologies. OpenMP -- compiler-created multi-threaded applications. You put pragmas in your code to tell the compiler how to parallelize your application. Open MPI -- a library for explicit
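A minimal sketch of the distinction, assuming a compiler that accepts -fopenmp and the mpicc wrapper (illustrative, not from the thread):

    /* OpenMP: the compiler turns the pragma into threads within one
     * process.  Open MPI: the program makes explicit library calls to
     * pass messages between processes. */
    #include <stdio.h>
    #include <omp.h>
    #include <mpi.h>

    int main(int argc, char **argv) {
        int rank;
        MPI_Init(&argc, &argv);               /* Open MPI: explicit call */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        #pragma omp parallel                  /* OpenMP: compiler-generated threads */
        printf("MPI rank %d, OpenMP thread %d\n", rank, omp_get_thread_num());

        MPI_Finalize();
        return 0;
    }

Build with something like mpicc -fopenmp; the two technologies compose, but they are configured and debugged independently.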

[OMPI users] Slides from the Open MPI SC'19 BOF

2019-11-21 Thread Jeff Squyres (jsquyres) via users
Thanks to all who came to see the Open MPI State of the Union BOF at SC'19 in Denver yesterday. I have posted the slides on the Open MPI web site -- that may take a little time to propagate out through the CDN to reach everyone, but they should show up soon:

Re: [OMPI users] How can I specify the number of threads for mpirun?

2019-11-14 Thread Jeff Squyres (jsquyres) via users
un, but not when started with MPICH's mpiexec.hydra. The most likely problem is that your "hello" program wasn't built against OMPI - are you trying to run the same binary with both mpirun and mpiexec.hydra? If so, that won't work. On Nov 14, 2019, at 8:58 AM, Jeff Squyres (jsquyre

Re: [OMPI users] How can I specify the number of threads for mpirun?

2019-11-14 Thread Jeff Squyres (jsquyres) via users
Are you asking a question about MPICH? If so, I think you should probably ask on their mailing lists -- they're an entirely different project from Open MPI. Also, I think you mean "processes", not "threads". On Nov 11, 2019, at 5:01 PM, sdcycling via users <users@lists.open-mpi.org>

Re: [OMPI users] OpenMPI - Job pauses and goes no further

2019-11-13 Thread Jeff Squyres (jsquyres) via users
Agree with Ralph. Your next step is to try what is suggested in the FAQ: run hello_c and ring_c. They are in the examples/ directory in the source tarball. Once Open MPI is installed (and things like "mpicc" can be found in your $PATH), you can just cd in there and run "make" to build them.
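For reference, examples/hello_c.c amounts to a program along these lines (a from-memory sketch, not a verbatim copy of the tarball file):

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv) {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's ID */
        MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total process count */
        printf("Hello, world, I am %d of %d\n", rank, size);
        MPI_Finalize();
        return 0;
    }

If mpirun -np 4 ./hello_c prints four lines, the installation's basic plumbing works; ring_c then exercises actual point-to-point communication.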

Re: [OMPI users] qelr_alloc_context: Failed to allocate context for device.

2019-11-13 Thread Jeff Squyres (jsquyres) via users
Have you tried using the UCX PML? The UCX PML is Mellanox's preferred Open MPI mechanism (instead of using the openib BTL). > On Nov 13, 2019, at 9:35 AM, Matteo Guglielmi via users > wrote: > > I rolled everything back to stock centos 7.7 installing OFED via: > > > > > yum groupinstall

Re: [OMPI users] Change behavior of --output-filename

2019-11-12 Thread Jeff Squyres (jsquyres) via users
On Nov 12, 2019, at 9:17 AM, Ralph Castain via users mailto:users@lists.open-mpi.org>> wrote: The man page is simply out of date - see https://github.com/open-mpi/ompi/issues/7095 for further thinking And https://github.com/open-mpi/ompi/issues/7133 for what might happen going forward. --

Re: [OMPI users] OpenMpi not throwing C++ exceptions

2019-11-07 Thread Jeff Squyres (jsquyres) via users
On Nov 7, 2019, at 4:37 PM, Mccall, Kurt E. (MSFC-EV41) via users wrote: > > Something is odd here, though -- I have two separately compiled OpenMpi > directories, one with and one without Torque support (via the -with-tm > configure flag). Ompi_info chose the one without Torque support.

Re: [OMPI users] OpenMpi not throwing C++ exceptions

2019-11-07 Thread Jeff Squyres (jsquyres) via users
On Nov 7, 2019, at 4:18 PM, Mccall, Kurt E. (MSFC-EV41) via users wrote: > > > You need to also set the MPI::ERRORS_THROW_EXCEPTIONS error handler in > your MPI application. > > Thanks Jeff. I double-checked, and yes, I’m calling > MPI_Comm_set_errhandler(com ,

Re: [OMPI users] OpenMpi not throwing C++ exceptions

2019-11-07 Thread Jeff Squyres (jsquyres) via users
On Nov 7, 2019, at 3:02 PM, Mccall, Kurt E. (MSFC-EV41) via users <users@lists.open-mpi.org> wrote: My program is failing in MPI_Comm_spawn, but it seems to simply terminate the job rather than throwing an exception that I can catch. Here is the abbreviated error message:

Re: [OMPI users] Configure Error for installation of openmpi-1.10.1

2019-11-01 Thread Jeff Squyres (jsquyres) via users
disable 2 plugins for openib; however, I still got the same error. I > attached all the stdout files when I ran configure and make. Thank you. > > Best Regards, > Qianjin > From: Jeff Squyres (jsquyres) > Sent: Friday, November 1, 2019 10:04 AM > To: Qianjin Zheng > Cc: Op

Re: [OMPI users] Configure Error for installation of openmpi-1.10.1

2019-11-01 Thread Jeff Squyres (jsquyres) via users
eng wrote: > > Hi Jeff, > > I attached the stdout from when I ran configure. Thank you > > Regards, > Qianjin > From: Jeff Squyres (jsquyres) > Sent: Thursday, October 31, 2019 1:55 PM > To: Qianjin Zheng > Cc: Open MPI User's List > Subject: Re: [OMPI

Re: [OMPI users] mpirun --output-filename behavior

2019-11-01 Thread Jeff Squyres (jsquyres) via users
On Nov 1, 2019, at 10:14 AM, Reuti <re...@staff.uni-marburg.de> wrote: For the most part, this whole thing needs to get documented. Especially that the colon is a disallowed character in the directory name. Any suffix :foo will just be removed AFAICS without any error output about foo

Re: [OMPI users] mpirun --output-filename behavior

2019-11-01 Thread Jeff Squyres (jsquyres) via users
On Nov 1, 2019, at 9:34 AM, Jeff Squyres (jsquyres) via users wrote: > >> Point to make: it would be nice to have an option to suppress the output on >> stdout and/or stderr when output redirection to file is requested. In my >> case, having stdout still visible on the

Re: [OMPI users] mpirun --output-filename behavior

2019-11-01 Thread Jeff Squyres (jsquyres) via users
On Oct 31, 2019, at 6:43 PM, Joseph Schuchart via users wrote: > > Just to throw in my $0.02: I recently found that the output to stdout/stderr > may not be desirable: in an application that writes a lot of log data to > stderr on all ranks, stdout was significantly slower than the files I >

Re: [OMPI users] mpirun --output-filename behavior

2019-10-31 Thread Jeff Squyres (jsquyres) via users
On Oct 30, 2019, at 2:16 PM, Kulshrestha, Vipul <vipul_kulshres...@mentor.com> wrote: Given that this is an intended behavior, I have a couple of follow up questions: 1. What is the purpose of the directory “1” that gets created currently? (in /app.log/1/rank./stdout ) Is this

Re: [OMPI users] Configure Error for installation of openmpi-1.10.1

2019-10-31 Thread Jeff Squyres (jsquyres) via users
On Oct 31, 2019, at 4:17 PM, Qianjin Zheng <qianjin.zh...@hotmail.com> wrote: I did not see any stdout from when I ran configure. Can you more specify file name? When you run Open MPI's "configure" script, there is a ton of output to stdout. Check out

Re: [OMPI users] Configure Error for installation of openmpi-1.10.1

2019-10-31 Thread Jeff Squyres (jsquyres) via users
mail. Thank you, Qianjin ____ From: Jeff Squyres (jsquyres) <jsquy...@cisco.com> Sent: Thursday, October 31, 2019 5:21 AM To: Qianjin Zheng <qianjin.zh...@hotmail.com> Cc: Open MPI User's List <users@lists.open-mpi.org> Subject

Re: [OMPI users] Configure Error for installation of openmpi-1.10.1

2019-10-31 Thread Jeff Squyres (jsquyres) via users
Please keep users@lists.open-mpi.org in the CC so that other users can benefit from this information. More below. On Oct 30, 2019, at 10:18 PM, Qianjin Zheng <qianjin.zh...@hotmail.com> wrote: Hi Jeff, I added --enable-no-build=btl:openib on the

Re: [OMPI users] Configure Error for installation of openmpi-1.10.1

2019-10-31 Thread Jeff Squyres (jsquyres) via users
Please keep "users@lists.open-mpi.org" in the CC so that others can Google to find this info in the future. More below. On Oct 30, 2019, at 9:08 PM, Qianjin Zheng mailto:qianjin.zh...@hotmail.com>> wrote: Hi Jeff, Thank you for the suggestions. I will try it.

Re: [OMPI users] Configure Error for installation of openmpi-1.10.1

2019-10-30 Thread Jeff Squyres (jsquyres) via users
___ From: Jeff Squyres (jsquyres) <jsquy...@cisco.com> Sent: Wednesday, October 30, 2019 7:40 PM To: Open MPI User's List <users@lists.open-mpi.org> Cc: Qianjin Zheng <qianjin.zh...@hotmail.com> Subject: Re: [OMPI users] Configure Error for installation of

Re: [OMPI users] Configure Error for installation of openmpi-1.10.1

2019-10-30 Thread Jeff Squyres (jsquyres) via users
v1.10.x is pretty ancient. Is there any chance you can update to 4.0.2? That's the latest version (and it has significantly better MPI_THREAD_MULTIPLE support). On Oct 30, 2019, at 8:36 PM, Qianjin Zheng via users <users@lists.open-mpi.org> wrote: I tried to install openmpi-1.10.1
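For context, an application asks for MPI_THREAD_MULTIPLE at startup and the library reports what it can actually provide; a minimal check (illustrative sketch):

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv) {
        int provided;
        /* Request full multi-threaded support; the library returns the
         * level it can actually deliver in 'provided'. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        if (provided < MPI_THREAD_MULTIPLE)
            printf("only got thread level %d\n", provided);
        MPI_Finalize();
        return 0;
    }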

Re: [OMPI users] mpirun --output-filename behavior

2019-10-29 Thread Jeff Squyres (jsquyres) via users
On Oct 29, 2019, at 7:30 PM, Kulshrestha, Vipul via users <users@lists.open-mpi.org> wrote: Hi, We recently shifted from openMPI 2.0.1 to 4.0.1 and are seeing an important behavior change with respect to above option. We invoke mpirun as % mpirun --output-filename /app.log -np With

[OMPI users] Open MPI State of the Union BOF at SC'19

2019-10-23 Thread Jeff Squyres (jsquyres) via users
Be sure to come to the Open MPI State of the Union BOF at SC'19 next month! As usual, we'll discuss the current status and future roadmap for Open MPI, answer questions, and generally be available for discussion. The BOF will be in the Wednesday noon hour:

Re: [OMPI users] Parameters at run time

2019-10-21 Thread Jeff Squyres (jsquyres) via users
In addition to what Gilles said, I usually advise users in ambiguous situations to explicitly choose the transport. For example, you might want to explicitly choose using the UCX PML: mpirun --mca pml ucx ... This way, you are 100% sure that Open MPI chose the UCX PML (if it can't choose the

Re: [OMPI users] OpenMPI-4.0.1 ubuntu 18.04 server

2019-10-15 Thread Jeff Squyres (jsquyres) via users
I don't suppose you could upgrade to 4.0.2, could you? We just released 4.0.2 with a ton of good bug fixes. On Oct 15, 2019, at 2:07 PM, Eric F. Alemany via users <users@lists.open-mpi.org> wrote: Hi, I am using OpenMPI-4.0.1 on a single ubuntu 18.04 server with 64 cores. I compiled

Re: [OMPI users] openmpi-4.0.1 build error

2019-10-03 Thread Jeff Squyres (jsquyres) via users
There may actually be API changes in the UCX library that are causing this issue. Could someone from the UCX community comment on this issue? On Oct 3, 2019, at 1:15 PM, Llolsten Kaonga via users <users@lists.open-mpi.org> wrote: Hello all, It looks like this build error could be

Re: [OMPI users] UCX errors after upgrade

2019-09-25 Thread Jeff Squyres (jsquyres) via users
On 9/25/19 1:28 PM, Jeff Squyres (jsquyres) via users wrote: Can you try the latest 4.0.2rc tarball? We're very, very close to releasing v4.0.2... I don't know if there's a specific UCX fix in there, but there are a ton of other good bug fixes in there since v4.0.1. On Sep 25, 2019, at 2

Re: [OMPI users] UCX errors after upgrade

2019-09-25 Thread Jeff Squyres (jsquyres) via users
Can you try the latest 4.0.2rc tarball? We're very, very close to releasing v4.0.2... I don't know if there's a specific UCX fix in there, but there are a ton of other good bug fixes in there since v4.0.1. On Sep 25, 2019, at 2:12 PM, Raymond Muno via users <users@lists.open-mpi.org>

Re: [OMPI users] silent failure for large allgather

2019-09-13 Thread Jeff Squyres (jsquyres) via users
Emmanuel -- Looks like the right people missed this when you posted; sorry about that! We're tracking it now: https://github.com/open-mpi/ompi/issues/6976 On Sep 13, 2019, at 3:04 AM, Emmanuel Thomé via users <users@lists.open-mpi.org> wrote: Hi, Thanks Jeff for your reply, and

Re: [OMPI users] Floating point overflow and tuning

2019-09-09 Thread Jeff Squyres (jsquyres) via users
On Sep 6, 2019, at 2:17 PM, Logan Stonebraker via users wrote: > > I am working with star ccm+ 2019.1.1 Build 14.02.012 > > CentOS 7.6 kernel 3.10.0-957.21.3.el7.x86_64 > > Intel MPI Version 2018 Update 5 Build 20190404 (this is version shipped with > star ccm+) > > Also trying to make

Re: [OMPI users] **URGENT: Error during testing

2019-08-19 Thread Jeff Squyres (jsquyres) via users
ith-slurm --with-pmix=/usr/local --enable-mpi1-compatibility --with-libevent=/usr/local --with-hwloc=/usr/local making certain linking against the same libevent This is on linux most recent custom kernel, and most recent SLURM scheduler. best: steve On Mon, Aug 19, 2019 at 2:07 PM Jeff Squyres

Re: [OMPI users] **URGENT: Error during testing

2019-08-19 Thread Jeff Squyres (jsquyres) via users
g testing Is there any chance that the fact that Riddhi appears to be trying to execute an uncompiled hello.c could be the problem here? From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Jeff Squyres (jsquyres) via users Sent: Monday, August 19, 2019 2:05 PM To: Open MPI User's

Re: [OMPI users] **URGENT: Error during testing

2019-08-19 Thread Jeff Squyres (jsquyres) via users
Can you provide some more details? https://www.open-mpi.org/community/help/ On Aug 19, 2019, at 1:18 PM, Riddhi A Mehta via users <users@lists.open-mpi.org> wrote: Hello My name is Riddhi and I am a Graduate Research Assistant in the Dept. of Physics & Astronomy at Purdue University.

Re: [OMPI users] Error with OpenMPI: Could not resolve generic procedure mpi_irecv

2019-08-19 Thread Jeff Squyres (jsquyres) via users
On Aug 19, 2019, at 6:15 AM, Sangam B via users wrote: > > subroutine recv(this,lmb) > class(some__example6), intent(inout) :: this > integer, intent(in) :: lmb(2,2) > > integer :: cs3, ierr > integer(kind=C_LONG) :: size This ^^ is your problem. More below. > ! receive

Re: [hwloc-users] Netloc feature suggestion

2019-08-16 Thread Jeff Squyres (jsquyres) via hwloc-users
Don't forget that network topologies can also be complex -- it's not always a simple, single-path hierarchy. There can be multiple paths between any pair of hosts on the network. Sometimes the hosts are aware of the multiple paths, sometimes they are not (e.g., sometimes the fabric routing

Re: [OMPI users] Debug OMPI errors

2019-08-05 Thread Jeff Squyres (jsquyres) via users
e the output of > strace for example, so it doesn't make the actual output of the MPI job > harder to read. > > I assume it could be either something enabled during compilation of OMPI > itself, or something passed during runtime (will be better). > > > All the best,

Re: [OMPI users] OpenMPI 2.1.1 bug on Ubuntu 18.04.2 LTS

2019-08-02 Thread Jeff Squyres (jsquyres) via users
38 > I don't know how many hours of people's time were wasted on re-discovering > this issue. > --Junchao Zhang > > > On Fri, Aug 2, 2019 at 2:54 PM Jeff Squyres (jsquyres) via users > wrote: > Ah, got it. > > Yes, if I compile with --enable-heterogeneous, the I can

Re: [OMPI users] OpenMPI 2.1.1 bug on Ubuntu 18.04.2 LTS

2019-08-02 Thread Jeff Squyres (jsquyres) via users
via users > wrote: > > Juanchao, > > > Is the issue related to https://github.com/open-mpi/ompi/pull/4501 ? > > > Jeff, > > > you might have to configure with --enable-heterogeneous to evidence the issue > > > > Cheers, > > > Gilles

Re: [OMPI users] OpenMPI 2.1.1 bug on Ubuntu 18.04.2 LTS

2019-08-01 Thread Jeff Squyres (jsquyres) via users
-get install libopenmpi-dev=2.1.6 > Reading package lists... Done > Building dependency tree > Reading state information... Done > E: Version '2.1.6' for 'libopenmpi-dev' was not found > > --Junchao Zhang > > > On Thu, Aug 1, 2019 at 1:15 PM Jeff Squyres (jsquyr

Re: [OMPI users] OpenMPI 2.1.1 bug on Ubuntu 18.04.2 LTS

2019-08-01 Thread Jeff Squyres (jsquyres) via users
Does the bug exist in Open MPI v2.1.6? > On Jul 31, 2019, at 2:19 PM, Zhang, Junchao via users > wrote: > > Hello, > I met a bug with OpenMPI 2.1.1 distributed in the latest Ubuntu 18.04.2 > LTS. It happens with self to self send/recv using MPI_ANY_SOURCE for message > matching. See the

Re: [OMPI users] How to know how OpenMPI was built?

2019-07-30 Thread Jeff Squyres (jsquyres) via users
Run ompi_info. > On Jul 30, 2019, at 5:57 PM, Zhang, Junchao via users > wrote: > > Hello, > On a system with pre-installed OpenMPI, how to know the configure options > used to build OpenMPI (so that I can build from source myself with the same > options)? > Thanks > > --Junchao Zhang

Re: [OMPI users] Debug OMPI errors

2019-07-28 Thread Jeff Squyres (jsquyres) via users
I'm not sure exactly what you are asking -- can you be more specific? Are you asking if Open MPI can emit more detail when an error occurs and the job aborts? > On Jul 28, 2019, at 4:12 AM, Passant A. Hafez via users > wrote: > > Hello all, > > I was wondering if I can enable some

Re: [OMPI users] When is it safe to free the buffer after MPI_Isend?

2019-07-28 Thread Jeff Squyres (jsquyres) via users
On Jul 27, 2019, at 10:43 PM, Gilles Gouaillardet via users wrote: > > MPI_Isend() does not automatically free the buffer after it sends the > message. > (it simply cannot do it since the buffer might be pointing to a global > variable or to the stack). Gilles is correct: MPI_Isend does not
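A minimal sketch of the rule being described, assuming at least two ranks (buffer size and contents are hypothetical):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <mpi.h>

    int main(int argc, char **argv) {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            char *buffer = malloc(64);
            MPI_Request req;
            strcpy(buffer, "hello");
            MPI_Isend(buffer, 64, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &req);
            /* ...overlap computation here... */
            MPI_Wait(&req, MPI_STATUS_IGNORE); /* only now has the send completed */
            free(buffer);                      /* safe: the request is done */
        } else if (rank == 1) {
            char buffer[64];
            MPI_Recv(buffer, 64, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 1 received: %s\n", buffer);
        }
        MPI_Finalize();
        return 0;
    }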

Re: [OMPI users] bash: orted: command not found -- ran through the FAQ already

2019-07-25 Thread Jeff Squyres (jsquyres) via users
On Jul 25, 2019, at 10:31 AM, Ewen Chan via users wrote: > > Here's my configuration: > > OS: CentOS 7.6.1810 x86_64 (it's a fresh install. I installed it last night.) > OpenMPI version: 1.10.7 (that was the version that was available in the > CentOS install repo) > path to mpirun:

Re: [OMPI users] How is the rank determined (Open MPI and Podman)

2019-07-24 Thread Jeff Squyres (jsquyres) via users
On Jul 24, 2019, at 5:16 PM, Ralph Castain via users wrote: > > It doesn't work that way, as you discovered. You need to add this information > at the same place where vader currently calls modex send, and then retrieve > it at the same place vader currently calls modex recv. Those macros

Re: [OMPI users] When is it safe to free the buffer after MPI_Isend?

2019-07-22 Thread Jeff Squyres (jsquyres) via users
> On Jul 21, 2019, at 11:31 AM, carlos aguni via users > wrote: > > MPI_Isend() > ... some stuff.. > flag = 0; > MPI_Test(req, &flag, &status); > if (flag){ > free(buffer); > } > > After the free() i'm getting errors like: > [[58327,1],0][btl_tcp_frag.c:130:mca_btl_tcp_frag_send] >

Re: [OMPI users] Segmentation fault when using 31 or 32 ranks

2019-07-10 Thread Jeff Squyres (jsquyres) via users
It might be worth trying the latest v4.0.x nightly snapshot -- we just updated the internal PMIx on the v4.0.x branch: https://www.open-mpi.org/nightly/v4.0.x/ > On Jul 10, 2019, at 1:29 PM, Steven Varga via users > wrote: > > Hi i am fighting similar. Did you try to update the pmix

Re: [OMPI users] Compilation errors with SunOS and Sun CC

2019-07-09 Thread Jeff Squyres (jsquyres) via users
Thanks for the report. I do not believe that we have supported SunOS 5.x in quite some time. I'm quite sure we haven't been testing on it, so it does not surprise me that there would be errors on that platform. It may be that an older version of Open MPI supports that platform -- I'm afraid

Re: [OMPI users] growing memory use from MPI application

2019-06-20 Thread Jeff Squyres (jsquyres) via users
On Jun 20, 2019, at 1:34 PM, Noam Bernstein wrote: > > Aha - using Mellanox’s OFED packaging seems to essentially (if not 100%) > fixed the issue. There still appears to be some small leak, but it’s of > order 1 GB, not 10s of GB, and it doesn’t grow continuously. And on later > runs of

Re: [OMPI users] OpenMPI 4 and pmi2 support

2019-06-20 Thread Jeff Squyres (jsquyres) via users
rs On Behalf Of Noam Bernstein > via users > Sent: Thursday, June 20, 2019 9:16 AM > To: Jeff Squyres (jsquyres) > Cc: Noam Bernstein ; Open MPI User's List > > Subject: Re: [OMPI users] OpenMPI 4 and pmi2 support > > > > > On Jun 20, 2019, at 11:54 AM, Jeff Squ

Re: [OMPI users] Intel Compilers

2019-06-20 Thread Jeff Squyres (jsquyres) via users
Can you send the exact ./configure line you are using to configure Open MPI? > On Jun 20, 2019, at 12:32 PM, Charles A Taylor via users > wrote: > > > >> On Jun 20, 2019, at 12:10 PM, Carlson, Timothy S >> wrote: >> >> I’ve never seen that error and have built some flavor of this

Re: [OMPI users] OpenMPI 4 and pmi2 support

2019-06-20 Thread Jeff Squyres (jsquyres) via users
On Jun 14, 2019, at 2:02 PM, Noam Bernstein via users wrote: > > Hi Jeff - do you remember this issue from a couple of months ago? Noam: I'm sorry, I totally missed this email. My INBOX is a continual disaster. :-( > Unfortunately, the failure to find pmi.h is still happening. I just

Re: [OMPI users] growing memory use from MPI application

2019-06-20 Thread Jeff Squyres (jsquyres) via users
On Jun 20, 2019, at 9:31 AM, Noam Bernstein via users wrote: > > One thing that I’m wondering if anyone familiar with the internals can > explain is how you get a memory leak that isn’t freed when then program ends? > Doesn’t that suggest that it’s something lower level, like maybe a kernel

Re: [OMPI users] Packaging issue with linux spec file when not build_all_in_one_rpm due to empty grep

2019-05-14 Thread Jeff Squyres (jsquyres) via users
Daniel -- Many thanks for bringing this to our attention. I've filed a PR with the fix: https://github.com/open-mpi/ompi/pull/6659 > On Apr 16, 2019, at 2:58 PM, Daniel Letai wrote: > > In src rpm version 4.0.1 if building with --define 'build_all_in_one_rpm 0' > the grep -v _mandir

Re: [OMPI users] error running mpirun command

2019-05-14 Thread Jeff Squyres (jsquyres) via users
Looks like this thread accidentally got dropped; sorry! More below. > On May 4, 2019, at 10:40 AM, Eric F. Alemany via users > wrote: > > Hi Gilles, > > Thank you for your message and your suggestion. As you suggested i tried > mpirun -np 84 --hostfile hostsfile --mca routed direct

Re: [OMPI users] MPI failing on Infiniband (queue pair error)

2019-05-09 Thread Jeff Squyres (jsquyres) via users
You might want to try two things: 1. Upgrade to Open MPI v4.0.1. 2. Use the UCX PML instead of the openib BTL. You may need to download/install UCX first. Then configure Open MPI: ./configure --with-ucx --without-verbs --enable-mca-no-build=btl-uct ... This will build the UCX PML, and that

Re: [OMPI users] Problems in 4.0.1 version and printf function

2019-05-09 Thread Jeff Squyres (jsquyres) via users
Stdout forwarding should continue to work in v4.0.x just like it did in v3.0.x. I.e., printf's from your app should appear in the stdout of mpirun. Sometimes they can get buffered, however, such as if you redirect the stdout to a file or to a pipe. Such shell buffering may only emit output
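One hedged way to rule out such buffering while diagnosing missing output is to flush explicitly after each print:

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv) {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        printf("rank %d checking in\n", rank);
        fflush(stdout);  /* defeat stdio buffering when stdout is a pipe or file */
        MPI_Finalize();
        return 0;
    }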

Re: [OMPI users] 3.0.4, 4.0.1 build failure on OSX Mojave with LLVM

2019-04-24 Thread Jeff Squyres (jsquyres) via users
include > -fPIC -pipe -DSTDC_HEADERS -DOPAL_STDC_HEADERS' > > in particular, the last two defines, fixes this. > > .........John > > > > On 4/23/2019 4:59 PM, Jeff Squyres (jsquyres) wrote: >> The version of LLVM that I have installed on my Mac 1

Re: [OMPI users] 3.0.4, 4.0.1 build failure on OSX Mojave with LLVM

2019-04-23 Thread Jeff Squyres (jsquyres) via users
The version of LLVM that I have installed on my Mac 10.14.4 is: $ where clang /usr/bin/clang $ clang --version Apple LLVM version 10.0.1 (clang-1001.0.46.4) Target: x86_64-apple-darwin18.5.0 Thread model: posix InstalledDir:

Re: [OMPI users] allow over-subscription by default

2019-04-17 Thread Jeff Squyres (jsquyres) via users
On Apr 17, 2019, at 3:38 AM, Steffen Christgau wrote: > > as written in my original post, I'm using a custom build of 4.0.0 which I'm sorry -- I missed that (it was at the bottom; my bad). > was configured with nothing more than a --prefix and > --enable-mpi-fortran. I checked for updates and

Re: [OMPI users] allow over-subscription by default

2019-04-16 Thread Jeff Squyres (jsquyres) via users
Steffen -- What version of Open MPI are you using? > On Apr 16, 2019, at 9:21 AM, Steffen Christgau > wrote: > > Hi Tim, > > it helps, up to four processes. But it has two drawbacks. 1) Using more > cores/threads than the machine provides (so the actual > over-subscription) is still not

Re: [OMPI users] relocating an installation

2019-04-10 Thread Jeff Squyres (jsquyres) via users
To be clear, --prefix and OPAL_PREFIX do two different things: 1. "mpirun --prefix" and "/abs/path/to/mpirun" (both documented in mpirun(1)) are used to set PATH and LD_LIBRARY_PATH on remote nodes when invoked via ssh/rsh. It's a way of not having to set your shell startup files to point to

Re: [OMPI users] relocating an installation

2019-04-09 Thread Jeff Squyres (jsquyres) via users
Reuti's right. Sorry about the potentially misleading use of "--prefix" -- we basically inherited that CLI option from a different MPI implementation (i.e., people asked for it). So we were locked into that meaning for the "--prefix" CLI options. > On Apr 9, 2019, at 9:14 AM, Reuti wrote:

Re: [OMPI users] Error when Building an MPI Java program

2019-04-09 Thread Jeff Squyres (jsquyres) via users
Can you provide a small, standalone example + recipe to show the problem? > On Apr 8, 2019, at 6:45 AM, Benson Muite wrote: > > Hi > > Am trying to build an MPI Java program using OpenMPI 4.0.1: > > I get the following error: > > Compilation failure >

Re: [OMPI users] OpenMPI 4 and pmi2 support

2019-03-22 Thread Jeff Squyres (jsquyres) via users
Noam -- I believe we fixed this issue after v4.0.0 was released. Can you try the v4.0.1rc3 tarball that was just released today? https://www.open-mpi.org/software/ompi/v4.0/ > On Mar 22, 2019, at 6:07 PM, Noam Bernstein via users > wrote: > > Hi - I'm trying to compile openmpi 4.0.0

Re: [OMPI users] _init function being called for every linked OpenMPI library

2019-03-22 Thread Jeff Squyres (jsquyres) via users
Yes, it's the DLL init function. It's not in our source code; it's put there automatically by the compiler/linker. > On Mar 22, 2019, at 2:12 PM, Simone Atzeni wrote: > > Hi, > > I was debugging a program compiled with `mpicxx` and noticed that when the > program is being launched the

Re: [OMPI users] Error initializing an UCX / OpenFabrics device. #6300

2019-03-22 Thread Jeff Squyres (jsquyres) via users
Greetings Charlie. Yes, it looks like you replied to a closed issue on Github -- would you mind opening a new issue about it? You can certainly refer to the old issue for context. But replying to closed issues is a bit dangerous: if we miss the initial email from GitHub (and all of us have

Re: [OMPI users] Double free in collectives

2019-03-15 Thread Jeff Squyres (jsquyres) via users
On Mar 14, 2019, at 8:46 PM, Gilles Gouaillardet wrote: > > The first location is indeed in ompi_coll_libnbc_iallreduce() What on earth was I looking at this morning? Scheesh. Thanks for setting me straight...! -- Jeff Squyres jsquy...@cisco.com

Re: [OMPI users] Best way to send on mpi c, architecture dependent data type

2019-03-15 Thread Jeff Squyres (jsquyres) via users
On Mar 15, 2019, at 2:02 PM, Sergio None wrote: > > Yes, I know that C99 has these fixed standard library types. > > The point is that many commonly used functions, rand for example, return > non-fixed types. And then you need to do casts, typically to bigger types. > It can be a bit annoying.
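A small sketch of the workaround under discussion: widen once into a fixed-width type and send it with the matching MPI datatype (assumes two ranks; the value itself is arbitrary):

    #include <stdint.h>
    #include <stdlib.h>
    #include <mpi.h>

    /* rand() returns a plain int, whose width is platform-dependent.
     * Copying into a fixed-width type and sending with the matching
     * MPI datatype keeps both ends in agreement. */
    int main(int argc, char **argv) {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0) {
            int64_t value = (int64_t) rand();   /* widen once, explicitly */
            MPI_Send(&value, 1, MPI_INT64_T, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            int64_t value;
            MPI_Recv(&value, 1, MPI_INT64_T, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        }
        MPI_Finalize();
        return 0;
    }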

Re: [OMPI users] Double free in collectives

2019-03-14 Thread Jeff Squyres (jsquyres) via users
Lee Ann -- Thanks for your bug report. I'm not able to find a call to NBC_Schedule_request() in ompi_coll_libnbc_iallreduce(). I see 2 calls to NBC_Schedule_request() in ompi/mca/coll/libnbc/nbc_iallreduce.c, but they are in different functions. Can you clarify exactly which one(s) you're

Re: [OMPI users] [SciPy-Dev] Fwd: Announcement and thanks to Season of Docs survey respondents: Season of Docs has launched

2019-03-14 Thread Jeff Squyres (jsquyres) via users
Hmm -- yes, this could be quite interesting. We always have a need for documentation to be updated! I'm afraid that I already have direct responsibility for an intern this summer and some other "manage people" kinds of duties that mean that I will not have time to be a mentor in this program,

Re: [OMPI users] Web page update needed?

2019-03-12 Thread Jeff Squyres (jsquyres) via users
Good catch! I will go fix. Thanks for the heads up! > On Mar 11, 2019, at 3:40 PM, Bennet Fauber wrote: > > From the web page at > > https://www.open-mpi.org/nightly/ > >Before deciding which series to download, be sure to read Open > MPI's philosophy on >version numbers. The short

Re: [OMPI users] error while using open mpi to compile parallel-netcdf-1.8.1

2019-03-07 Thread Jeff Squyres (jsquyres) via users
FWIW, the error message is telling you exactly what is wrong and provides a link to the FAQ item on how to fix it. It's a bit inelegant that it's segv'ing after that, but the real issue is what is described in the help message. > On Mar 6, 2019, at 3:07 PM, Zhifeng Yang wrote: > > Hi > > I

Re: [OMPI users] Does Close_port invalidates the communicactor?

2019-03-01 Thread Jeff Squyres (jsquyres) via users
Close port should only affect new incoming connections. Your established communicator should still be fully functional. > On Mar 1, 2019, at 4:11 AM, Florian Lindner wrote: > > Hello, > > I wasn't able to find anything about that in the standard. Given this > situation: > > >
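A server-side sketch of that behavior (assumes a separate client program that calls MPI_Comm_connect with the printed port name and posts a matching receive):

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv) {
        char port[MPI_MAX_PORT_NAME];
        MPI_Comm client;
        int token = 42;

        MPI_Init(&argc, &argv);
        MPI_Open_port(MPI_INFO_NULL, port);
        printf("port: %s\n", port);  /* hand this string to the client */
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &client);
        MPI_Close_port(port);        /* no further connections on this port... */
        MPI_Send(&token, 1, MPI_INT, 0, 0, client);  /* ...but 'client' still works */
        MPI_Comm_disconnect(&client);
        MPI_Finalize();
        return 0;
    }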

Re: [OMPI users] Building PMIx and Slurm support

2019-02-28 Thread Jeff Squyres (jsquyres) via users
On Feb 28, 2019, at 12:20 PM, Bennet Fauber wrote: > > I was pointing out why someone might think using `--with-FEATURE=/usr` is > sometimes necessary. True, but only in the case of a bug. :-) ...but in this case, the bug is too old, and we're almost certainly not going to fix it (sorry!

Re: [OMPI users] Building PMIx and Slurm support

2019-02-28 Thread Jeff Squyres (jsquyres) via users
On Feb 28, 2019, at 11:27 AM, Bennet Fauber wrote: > > 13bb410b52becbfa140f5791bd50d580 /sw/src/arcts/ompi/openmpi-1.10.7.tar.gz > bcea63d634d05c0f5a821ce75a1eb2b2 openmpi-v1.10-201705170239-5e373bf.tar.gz Bennet -- I'm sorry; I don't think we've updated the 1.10.x branch in forever. The

Re: [OMPI users] OpenMPI v4.0.0 signal 11 (Segmentation fault)

2019-02-20 Thread Jeff Squyres (jsquyres) via users
Can you try the latest 4.0.x nightly snapshot and see if the problem still occurs? https://www.open-mpi.org/nightly/v4.0.x/ > On Feb 20, 2019, at 1:40 PM, Adam LeBlanc wrote: > > I do here is the output: > > 2 total processes killed (some possibly by mpirun during cleanup) >

Re: [OMPI users] Cannot install open mpi (Mac Mojave 10.14.2 (18C54))

2019-02-07 Thread Jeff Squyres (jsquyres) via users
The same steps you list for Open MPI v2.0.2 should work for Open MPI v4.0.0. Can you send the full set of information listed here: https://www.open-mpi.org/community/help/ > On Feb 1, 2019, at 4:00 PM, Neil Teng wrote: > > Hi, > > I am following the following these steps to install the

Re: [OMPI users] Help Getting Started with Open MPI and PMIx and UCX

2019-01-18 Thread Jeff Squyres (jsquyres) via users
On Jan 18, 2019, at 12:43 PM, Matt Thompson wrote: > > With some help, I managed to build an Open MPI 4.0.0 with: We can discuss each of these params to let you know what they are. > ./configure --disable-wrapper-rpath --disable-wrapper-runpath Did you have a reason for disabling these?

Re: [OMPI users] Increasing OpenMPI RMA win attach region count.

2019-01-09 Thread Jeff Squyres (jsquyres) via users
You can set this MCA var on a site-wide basis in a file: https://www.open-mpi.org/faq/?category=tuning#setting-mca-params > On Jan 9, 2019, at 1:18 PM, Udayanga Wickramasinghe wrote: > > Thanks. Yes, I am aware of that however, I currently have a requirement to > increase the default.
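The FAQ entry describes a plain "name = value" file that every Open MPI process reads at startup; an illustrative sketch (the parameter names below are common examples, not the specific RMA parameter from this thread):

    # $prefix/etc/openmpi-mca-params.conf -- applied site-wide.
    # One parameter per line:
    pml = ucx
    btl_tcp_if_include = eth0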

Re: [OMPI users] Suggestion to add one thing to look/check for when running OpenMPI program

2019-01-09 Thread Jeff Squyres (jsquyres) via users
Good suggestion; thank you! > On Jan 8, 2019, at 9:44 PM, Ewen Chan wrote: > > To Whom It May Concern: > > Hello. I'm new here and I got here via OpenFOAM. > > In the FAQ regarding running OpenMPI programs, specifically where someone > might be able to run their OpenMPI program on a local

Re: [OMPI users] v2.1.1 How to utilise multiple NIC ports

2018-12-22 Thread Jeff Squyres (jsquyres) via users
On Dec 22, 2018, at 10:56 AM, Bob Beattie wrote: > > How do I now go about setting up /etc/hosts, -hostfile entries and bringing > them all together on the mpirun run line ? > For example, my 2nd machine is a quad core Dell T3500. Should I create a > separate entry in /etc/hosts for each NIC

Re: [OMPI users] v2.1.1 How to utilise multiple NIC ports

2018-12-20 Thread Jeff Squyres (jsquyres) via users
On Dec 20, 2018, at 3:33 PM, Bob Beattie wrote: > > I'm working on OpenFOAM v5 and have been successful in getting two nodes > working together. (both 18.04 LTS connected via GbE) > As both machines have a quad port gigabit NIC I have been trying to persuade > mpirun to use more than a single

Re: [OMPI users] It's possible to get mpi working without ssh?

2018-12-19 Thread Jeff Squyres (jsquyres) via users
On Dec 19, 2018, at 11:42 AM, Daniel Edreira wrote: > > Does anyone know if there's a possibility to configure a cluster of nodes to > communicate with each other with mpirun without using SSH? > > Someone is asking me about making a cluster with Infiniband that does not use > SSH to

Re: [OMPI users] filesystem-dependent failure building Fortran interfaces

2018-12-04 Thread Jeff Squyres (jsquyres) via users
Hi Dave; thanks for reporting. Yes, we've fixed this -- it should be included in 4.0.1. https://github.com/open-mpi/ompi/pull/6121 If you care, you can try the nightly 4.0.x snapshot tarball -- it should include this fix: https://www.open-mpi.org/nightly/v4.0.x/ > On Dec 4, 2018,

Re: [OMPI users] [Open MPI Announce] Open MPI SC'18 State of the Union BOF slides

2018-11-27 Thread Jeff Squyres (jsquyres) via users
Bert -- Sorry for the slow reply; got caught up in SC'18 and the US Thanksgiving holiday. Yes, you are exactly correct (I saw your GitHub issue/pull request about this before I saw this email). We will fix this in 4.0.1 in the very near future. > On Nov 19, 2018, at 3:10 AM, Bert Wesarg

Re: [OMPI users] One question about progression of operations in MPI

2018-11-27 Thread Jeff Squyres (jsquyres) via users
Sorry for the delay in replying; the SC'18 show and then the US Thanksgiving holiday got in the way. More below. > On Nov 16, 2018, at 10:50 PM, Weicheng Xue wrote: > > Hi Jeff, > > Thank you very much for your reply! I am now using a cluster at my > university

Re: [OMPI users] One question about progression of operations in MPI

2018-11-16 Thread Jeff Squyres (jsquyres) via users
On Nov 13, 2018, at 8:52 PM, Weicheng Xue wrote: > > I am a student whose research work includes using MPI and OpenACC to > accelerate our in-house research CFD code on multiple GPUs. I am having a big > issue related to the "progression of operations in MPI" and am thinking your > inputs

Re: [OMPI users] Need Help - Thank you for this great tool

2018-11-07 Thread Jeff Squyres (jsquyres) via users
What is the exact problem you are trying to solve? Please send all the information listed here: https://www.open-mpi.org/community/help/ > On Nov 5, 2018, at 4:44 AM, saad alosaimi wrote: > > Dear All, > > First of all, thank you for this great tool. > Actually, I try to bind rank or

Re: [OMPI users] (no subject)

2018-11-01 Thread Jeff Squyres (jsquyres) via users
That's pretty weird. I notice that you're using 3.1.0rc2. Does the same thing happen with Open MPI 3.1.3? > On Oct 31, 2018, at 9:08 PM, Dmitry N. Mikushin wrote: > > Dear all, > > ompi_info reports pml components are available: > > $ /usr/mpi/gcc/openmpi-3.1.0rc2/bin/ompi_info -a | grep

Re: [OMPI users] check for CUDA support

2018-10-30 Thread Jeff Squyres (jsquyres) via users
ote: > > +1 to what Jeff said. > > So you would need --with-cuda pointing to a cuda installation to have > cuda-awareness in OpenMPI. > > On Tue, Oct 30, 2018 at 12:47 PM Jeff Squyres (jsquyres) via users > wrote: > The "Configure command line" shows you the comm

Re: [OMPI users] check for CUDA support

2018-10-30 Thread Jeff Squyres (jsquyres) via users
The "Configure command line" shows you the command line that was given to "configure" when building Open MPI. The "MPI extensions" line just indicates which Open MPI "extensions" were built. CUDA is one of the possible extensions that can get built. The CUDA Open MPI extension is actually an

Re: [OMPI users] Wrapper Compilers

2018-10-26 Thread Jeff Squyres (jsquyres) via users
On Oct 25, 2018, at 5:30 PM, Reuti wrote: > >> The program 'mpic++' can be found in the following packages: >> * lam4-dev >> * libmpich-dev >> * libopenmpi-dev > > PS: Interesting that they still include LAM/MPI, which was superseded by Open > MPI some time ago. ZOMG. As one of the last

Re: [OMPI users] openmpi-v4.0.0rc5: ORTE_ERROR_LOG: Data unpack would read past end of buffer

2018-10-23 Thread Jeff Squyres (jsquyres) via users
Siegmar: the issue appears to be using the rank mapper. We should get that fixed, but it may not be fixed for v4.0.0. Howard opened the following GitHub issue to track it: https://github.com/open-mpi/ompi/issues/5965 > On Oct 23, 2018, at 9:29 AM, Siegmar Gross > wrote: > > Hi, > >

Re: [OMPI users] [version 2.1.5] invalid memory reference

2018-10-11 Thread Jeff Squyres (jsquyres) via users
Patrick -- You might want to update your HDF code to not use MPI_LB and MPI_UB -- these constants were deprecated in MPI-2.1 in 2009 (an equivalent function, MPI_TYPE_CREATE_RESIZED was added in MPI-2.0 in 1997), and were removed from the MPI-3.0 standard in 2012. Meaning: the death of these
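A hedged sketch of the replacement: instead of bracketing a datatype with the removed MPI_LB/MPI_UB markers, set the lower bound and extent directly with MPI_TYPE_CREATE_RESIZED (the layout values here are illustrative):

    #include <mpi.h>

    int main(int argc, char **argv) {
        MPI_Datatype contig, resized;

        MPI_Init(&argc, &argv);
        MPI_Type_contiguous(3, MPI_DOUBLE, &contig);
        /* Give the type a lower bound of 0 and an extent of 4 doubles,
         * leaving one double of padding between consecutive elements. */
        MPI_Type_create_resized(contig, 0, 4 * sizeof(double), &resized);
        MPI_Type_commit(&resized);

        /* ...use 'resized' in sends/recvs/file views... */

        MPI_Type_free(&resized);
        MPI_Type_free(&contig);
        MPI_Finalize();
        return 0;
    }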

Re: [OMPI users] Cannot run MPI code on multiple cores with PBS

2018-10-11 Thread Jeff Squyres (jsquyres) via users
>>> On Oct 4, 2018, at 10:30 AM, John Hearns via users >>> wrote: >>> >>> Michele one tip: log into a compute node using ssh and as your own >>> username. >>> If you use the Modules environment then load the modules you use in >>> the

Re: [OMPI users] --mca btl params

2018-10-10 Thread Jeff Squyres (jsquyres) via users
On Oct 9, 2018, at 8:55 PM, Noam Bernstein wrote: > >> That's basically what the output of >> ompi_info -a >> says. You actually probably want: ompi_info | grep btl That will show you the names and versions of the "btl" plugins that are available on your system. For example, this is what I

Re: [OMPI users] opal_pmix_base_select failed for master and 4.0.0

2018-10-05 Thread Jeff Squyres (jsquyres) via users
): >> >> opal_pmix_base_select failed >> --> Returned value Not found (-13) instead of ORTE_SUCCESS >> -- >> loki hello_1 118 >> >> >> I don't know, if you have already appli
