Re: [OMPI users] Sockets half-broken in Open MPI 2.0.2?

2018-06-06 Thread Jeff Squyres (jsquyres) via users
Alexander -- I don't know offhand if 2.0.2 was faulty in this area. We usually ask users to upgrade to at least the latest release in a given series (e.g., 2.0.4) because various bug fixes are included in each sub-release. It wouldn't be much use to go through all the effort to make a proper

Re: [OMPI users] Fwd: OpenMPI 3.1.0 on aarch64

2018-06-08 Thread Jeff Squyres (jsquyres) via users
Hmm. I'm confused -- can we clarify? I just tried configuring Open MPI v3.1.0 on a RHEL 7.4 system with the RHEL hwloc RPM installed, but *not* the hwloc-devel RPM. Hence, no hwloc.h (for example). When specifying an external hwloc, configure did fail, as expected: - $ ./configure

Re: [OMPI users] Fwd: OpenMPI 3.1.0 on aarch64

2018-06-08 Thread Jeff Squyres (jsquyres) via users
On Jun 8, 2018, at 11:38 AM, Bennet Fauber wrote: > > Hmm. Maybe I had insufficient error checking in our installation process. > > Can you make and make install after the configure fails? I somehow got an > installation, despite the configure status, perhaps? If it's a fresh tarball

Re: [OMPI users] A couple of general questions

2018-06-14 Thread Jeff Squyres (jsquyres) via users
Charles -- It may have gotten lost in the middle of this thread, but the vendor-recommended way of running on InfiniBand these days is with UCX. I.e., install OpenUCX and use one of the UCX transports in Open MPI. Unless you have special requirements, you should likely give this a try and

Re: [OMPI users] A couple of general questions

2018-06-15 Thread Jeff Squyres (jsquyres) via users
On Jun 14, 2018, at 1:23 PM, Charles A Taylor wrote: > > Hmmm. ompi_info only shows the ucx pml. I don’t see any “transports”. > Will they show up somewhere or are they documented. Right now it looks like > the only UCX related thing I can do with openmpi 3.1.0 is Actually, I know that

Re: [OMPI users] A couple of general questions

2018-06-15 Thread Jeff Squyres (jsquyres) via users
> I will help with the OFI part. > > Thanks, > _MAC > > -Original Message- > From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Jeff > Squyres (jsquyres) via users > Sent: Thursday, June 14, 2018 12:50 PM > To: Open MPI User's List > C

Re: [OMPI users] A couple of general questions

2018-06-14 Thread Jeff Squyres (jsquyres) via users
eeling more than a little ignorant these days. :) > > Thanks to all for the responses. It has been a huge help. > > Charlie > >> On Jun 14, 2018, at 1:18 PM, Jeff Squyres (jsquyres) via users >> wrote: >> >> Charles -- >> >> It may have gott

Re: [OMPI users] error building openmpi-master-201806060243-64a5baa on Linux with Sun C

2018-06-06 Thread Jeff Squyres (jsquyres) via users
Siegmar -- I asked some Fortran gurus, and they don't think that there is any restriction on having ASYNCHRONOUS and INTENT on the same line. Indeed, Open MPI's definition of MPI_ACCUMULATE seems to agree with what is in MPI-3.1. Is this a new version of a Fortran compiler that you're using,

Re: [OMPI users] Open MPI: undefined reference to pthread_atfork

2018-06-22 Thread Jeff Squyres (jsquyres) via users
You already asked this question on the devel list, and I've asked you for more information. Please don't just re-post your question over here on the user list and expect to get a different answer. Thanks! > On Jun 22, 2018, at 4:55 PM, lille stor wrote: > > Hi, > > > When compiling a

Re: [OMPI users] Disable network interface selection

2018-06-22 Thread Jeff Squyres (jsquyres) via users
On Jun 22, 2018, at 7:36 PM, carlos aguni wrote: > > I'm trying to run a code on 2 machines that has at least 2 network interfaces > in it. > So I have them as described below: > > compute01 > compute02 > ens3 > 192.168.100.104/24 > 10.0.0.227/24 > ens8 > 10.0.0.228/24 > 172.21.1.128/24 > ens9

Re: [OMPI users] [EXTERNAL] Re: OpenMPI 3.1.0 Lock Up on POWER9 w/ CUDA9.2

2018-07-02 Thread Jeff Squyres (jsquyres) via users
Simon -- You don't currently have another Open MPI installation in your PATH / LD_LIBRARY_PATH, do you? I have seen dependency library loads cause "make check" to get confused, and instead of loading the libraries from the build tree, actually load some -- but not all -- of the required

Re: [OMPI users] Question about undefined routines when using mpi_f08

2018-08-02 Thread Jeff Squyres (jsquyres) via users
On Aug 2, 2018, at 4:40 PM, Grove, John W wrote: > > I am compiling an application using openmpi 3.1.1. The application is mixed > Fortran/C/C++. I am using the intel compiler on a mac pro running OS 10.13.6. > When I try to use the mpi_f08 interface I get unresolved symbols at load > time,

Re: [OMPI users] Trouble writing code for simple 2 node client-server CheckLatency using openMPI

2018-07-30 Thread Jeff Squyres (jsquyres) via users
Do you need to check it with Java, or will any MPI application do? If any language will do, you might want to check out the OSU MPI benchmarks: http://mvapich.cse.ohio-state.edu/benchmarks/ > On Jul 27, 2018, at 10:54 AM, John Bauman wrote: > > Hello everyone, > > I just want to start

Re: [OMPI users] know which CPU has the maximum value

2018-08-10 Thread Jeff Squyres (jsquyres) via users
two cents from a pedestrian MPI user, > who thinks minloc and maxloc are great, > knows nothing about the MPI Forum protocols and activities, > but hopes the Forum pays attention to users' needs. > > Gus Correa > > PS - Jeff S.: Please, bring Diego's request to the Forum! Add

Re: [OMPI users] know which CPU has the maximum value

2018-08-10 Thread Jeff Squyres (jsquyres) via users
It is unlikely that MPI_MINLOC and MPI_MAXLOC will go away any time soon. As far as I know, Nathan hasn't advanced a proposal to kill them in MPI-4, meaning that they'll likely continue to be in MPI for at least another 10 years. :-) (And even if they did get killed in MPI-4, implementations

Re: [OMPI users] MPI group and stuck in communication

2018-08-10 Thread Jeff Squyres (jsquyres) via users
I'm not quite clear what the problem is that you're running in to -- you just said that there is "some problem with MPI_barrier". What problem, exactly, is happening with your code? Be as precise and specific as possible. It's kinda hard to tell what is happening in the code snippet below

Re: [OMPI users] know which CPU has the maximum value

2018-08-10 Thread Jeff Squyres (jsquyres) via users
o deprecate MPI_{MIN,MAX}LOC, they should start that >> discussion on https://github.com/mpi-forum/mpi-issues/issues or >> https://lists.mpi-forum.org/mailman/listinfo/mpiwg-coll. >> Jeff >> On Fri, Aug 10, 2018 at 10:27 AM, Jeff Squyres (jsquyres) via users >> ma

Re: [OMPI users] MPI group and stuck in communication

2018-08-11 Thread Jeff Squyres (jsquyres) via users
On Aug 10, 2018, at 6:27 PM, Diego Avesani wrote: > > The question is: > Is it possible to have a barrier for all CPUs despite they belong to > different group? > If the answer is yes I will go in more details. By "CPUs", I assume you mean "MPI processes", right? (i.e., not threads inside an

Re: [OMPI users] MPI group and stuck in communication

2018-08-13 Thread Jeff Squyres (jsquyres) via users
On Aug 12, 2018, at 2:18 PM, Diego Avesani wrote: > > Dear all, Dear Jeff, > I have three communicators: > > the standard one: > MPI_COMM_WORLD > > and other two: > MPI_LOCAL_COMM > MPI_MASTER_COMM > > a sort of two-level MPI. > > Suppose to have 8 threads, > I use 4 threads to run the same

Re: [OMPI users] mpirun hangs

2018-08-15 Thread Jeff Squyres (jsquyres) via users
There can be lots of reasons that this happens. Can you send all the information listed here? https://www.open-mpi.org/community/help/ > On Aug 15, 2018, at 10:55 AM, Mota, Thyago wrote: > > Hello. > > I have openmpi 2.0.4 installed on a Cent OS 7. When I try to run "mpirun" it >

Re: [OMPI users] [WARNING: UNSCANNABLE EXTRACTION FAILED]Re: mpirun hangs

2018-08-15 Thread Jeff Squyres (jsquyres) via users
config.log and the ompi_info > > Thanks. > > On Wed, Aug 15, 2018 at 11:46 AM, Jeff Squyres (jsquyres) via users > wrote: > There can be lots of reasons that this happens. Can you send all the > information listed here? > > https://www.open-mpi.org/community/he

Re: [OMPI users] Unable to open a shared object libsmartio-rdmav17.so

2018-08-24 Thread Jeff Squyres (jsquyres) via users
I'm afraid the error message you're getting is from libibverbs; it's trying to load a plugin named libsmartio-rdmav17.so. That's not part of Open MPI, sorry. That likely means that some dependency of libsmartio-rdmav17.so wasn't found, and the run-time loading of the plugin failed (vs. not

[OMPI users] lists.open-mpi.org appears to be back

2018-08-28 Thread Jeff Squyres (jsquyres) via users
The lists.open-mpi.org server went offline due to an outage at our hosting provider sometime in the evening on Aug 22 / early morning Aug 23 (US Eastern time). The list server now appears to be back online; I've seen at least a few backlogged emails finally come through. If you sent a mail in

[OMPI users] lists.open-mpi.org appears to be back

2018-08-28 Thread Jeff Squyres (jsquyres) via users
The lists.open-mpi.org server went offline due to an outage at our hosting provider sometime in the evening on Aug 22 / early morning Aug 23 (US Eastern time). As of yesterday morning (Saturday, Aug 25), the list server now appears to be back online; I've seen at least a few backlogged emails

Re: [OMPI users] MPI advantages over PBS

2018-08-28 Thread Jeff Squyres (jsquyres) via users
On Aug 22, 2018, at 11:49 AM, Diego Avesani wrote: > > I have a philosophical question. > > I am reading a lot of papers where people use Portable Batch System or job > scheduler in order to parallelize their code. > > What are the advantages in using MPI instead? It depends on the code in

Re: [OMPI users] MPI_MAXLOC problems

2018-08-28 Thread Jeff Squyres (jsquyres) via users
I think Gilles is right: remember that datatypes like MPI_2DOUBLE_PRECISION are actually 2 values. So if you want to send 1 pair of double precision values with MPI_2DOUBLE_PRECISION, then your count is actually 1. > On Aug 22, 2018, at 8:02 AM, Gilles Gouaillardet > wrote: > > Diego, > >
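
To illustrate the point about counts, here is a minimal sketch in plain Python (no MPI involved) that emulates how MPI_MAXLOC combines (value, index) pairs. The function name maxloc and the sample contributions are made up for illustration; the ranks are kept as integers here, whereas MPI_2DOUBLE_PRECISION stores both members as double precision values.

```python
from functools import reduce


def maxloc(a, b):
    """Combine two (value, index) pairs the way MPI_MAXLOC does:
    keep the larger value; on a tie, keep the smaller index."""
    if a[0] > b[0]:
        return a
    if b[0] > a[0]:
        return b
    return (a[0], min(a[1], b[1]))


# Hypothetical data: one (value, rank) pair contributed by each process.
# Each pair is ONE element of the pair datatype, so a reduction over a
# single pair per process would use count = 1, not 2.
contributions = [(3.5, 0), (9.25, 1), (9.25, 2), (-1.0, 3)]

result = reduce(maxloc, contributions)
print(result)
```

The two numbers travel together as a single element; the reduction never compares a value against an index.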

Re: [OMPI users] lists.open-mpi.org appears to be back

2018-08-28 Thread Jeff Squyres (jsquyres) via users
I originally sent this mail on Saturday, but it looks like lists.open-mpi.org was *not* actually back at this time. I'm finally starting to see all the backlogged messages on Tuesday, around 5pm US Eastern time. So I think lists.open-mpi.org is finally back in service. Sorry for the

Re: [OMPI users] Are MPI datatypes guaranteed to be compile-time constants?

2018-09-04 Thread Jeff Squyres (jsquyres) via users
On Sep 4, 2018, at 5:22 PM, Benjamin Brock wrote: > > Are MPI datatypes like MPI_INT and MPI_CHAR guaranteed to be compile-time > constants? No. They are guaranteed to be link-time constants. > Is this defined by the MPI standard, or in the Open MPI implementation? MPI standard:

Re: [OMPI users] openmpi-3.1.2 libgfortran conflict

2018-09-05 Thread Jeff Squyres (jsquyres) via users
Glad you figured it out. Just for some additional color: https://www.open-mpi.org/faq/?category=building#install-overwrite > On Sep 3, 2018, at 4:17 AM, Patrick Begou > wrote: > > Solved. > Strange conflict (not explained) after several compilation test of OpenMPI > with gcc7. Solved

Re: [OMPI users] 3.1.1 Bindings Change

2018-07-04 Thread Jeff Squyres (jsquyres) via users
Greetings Matt. https://github.com/open-mpi/ompi/commit/4d126c16fa82c64a9a4184bc77e967a502684f02 is the specific commit where the fixes came in. Here's a little creative grepping that shows the APIs affected (there's also some callback function signatures that were fixed, too, but they're

Re: [OMPI users] Disable network interface selection

2018-07-09 Thread Jeff Squyres (jsquyres) via users
Can you send the full verbose output with "--mca btl_base_verbose 100"? > On Jul 4, 2018, at 4:36 PM, carlos aguni wrote: > > Hi Gilles. > > Thank you for your reply! :) > I'm now using a compiled version of OpenMPI 3.0.2 and all seems to work fine > now. > Running `mpirun -n 3 -host

Re: [OMPI users] Seg fault in opal_progress

2018-07-11 Thread Jeff Squyres (jsquyres) via users
Ok, that would be great -- thanks. Recompiling Open MPI with --enable-debug will turn on several debugging/sanity checks inside Open MPI, and it will also enable debugging symbols. Hence, if you can get a failure with a debug Open MPI build, it might give you a core file that can be used to

Re: [OMPI users] Seg fault in opal_progress

2018-07-12 Thread Jeff Squyres (jsquyres) via users
Noam and I actually talked on the phone (whtt!?) and worked through this a bit more. Oddly, he can generate core files if he runs in /tmp, but not if he runs in an NFS-mounted directory (!). I haven't seen that before -- if someone knows why that would happen, I'd love to hear the

Re: [OMPI users] Seg fault in opal_progress

2018-07-11 Thread Jeff Squyres (jsquyres) via users
M, Jeff Squyres (jsquyres) via users >> wrote: >>>> >>> >>> After more extensive testing it’s clear that it still happens with 2.1.3, >>> but much less frequently. I’m going to try to get more detailed info with >>> version 3.1.1, wh

Re: [OMPI users] Seg fault in opal_progress

2018-07-12 Thread Jeff Squyres (jsquyres) via users
On Jul 12, 2018, at 10:59 AM, Noam Bernstein wrote: > >> Do you get core files? >> >> Loading up the core file in a debugger might give us more information. > > No, I don’t, despite setting "ulimit -c unlimited”. I’m not sure what’s > going on with that (or the lack of line info in the

Re: [OMPI users] Seg fault in opal_progress

2018-07-12 Thread Jeff Squyres (jsquyres) via users
Do you get core files? Loading up the core file in a debugger might give us more information. > On Jul 12, 2018, at 9:35 AM, Noam Bernstein > wrote: > > >> On Jul 12, 2018, at 8:37 AM, Noam Bernstein >> wrote: >> >> I’m going to try the 3.1.x 20180710 nightly snapshot next. > > Same

Re: [OMPI users] Seg fault in opal_progress

2018-07-12 Thread Jeff Squyres (jsquyres) via users
On Jul 12, 2018, at 11:45 AM, Noam Bernstein wrote: > >> E.g., if you "ulimit -c" in your interactive shell and see "unlimited", but >> if you "ulimit -c" in a launched job and see "0", then the job scheduler is >> doing that to your environment somewhere. > > I am using a scheduler
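
One way to run the comparison described above (interactive shell vs. launched job) is a tiny script that reports the core file size limit of the process it runs in. This is only a sketch using Python's standard resource module; the function name describe_core_limit is made up for illustration, and the values are in bytes, so they may not match the units that "ulimit -c" prints.

```python
import resource


def describe_core_limit():
    """Return the soft and hard core-file size limits of this process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)

    def fmt(value):
        # RLIM_INFINITY is what "ulimit -c" reports as "unlimited".
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return f"core file size: soft={fmt(soft)} hard={fmt(hard)}"


print(describe_core_limit())
```

Run it once directly in the interactive shell and once through the launcher or scheduler (for example with mpirun, which can also start non-MPI programs). If the two outputs differ, something in the launch path is changing the limit.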

Re: [OMPI users] *** Error in `orted': double free or corruption (out): 0x00002aaab4001680 ***, in some node combos.

2018-09-11 Thread Jeff Squyres (jsquyres) via users
Thanks for reporting the issue. First, you can work around the issue by using: mpirun --mca oob tcp ... This uses a different out-of-band plugin (TCP) instead of verbs unreliable datagrams. Second, I just filed a fix for our current release branches (v2.1.x, v3.0.x, and v3.1.x):

Re: [OMPI users] stdout/stderr question

2018-09-11 Thread Jeff Squyres (jsquyres) via users
Gilles: Can you submit a PR to fix these 2 places? Thanks! > On Sep 11, 2018, at 9:10 AM, emre brookes wrote: > > Gilles Gouaillardet wrote: >> It seems I got it wrong :-( > Ah, you've joined the rest of us :) >> >> Can you please give the attached patch a try ? >> > Working with a git clone

Re: [OMPI users] opal_pmix_base_select failed for master and 4.0.0

2018-10-05 Thread Jeff Squyres (jsquyres) via users
): >> >> opal_pmix_base_select failed >> --> Returned value Not found (-13) instead of ORTE_SUCCESS >> -- >> loki hello_1 118 >> >> >> I don't know, if you have already appli

Re: [OMPI users] [version 2.1.5] invalid memory reference

2018-10-11 Thread Jeff Squyres (jsquyres) via users
Patrick -- You might want to update your HDF code to not use MPI_LB and MPI_UB -- these constants were deprecated in MPI-2.1 in 2009 (an equivalent function, MPI_TYPE_CREATE_RESIZED was added in MPI-2.0 in 1997), and were removed from the MPI-3.0 standard in 2012. Meaning: the death of these

Re: [OMPI users] Cannot run MPI code on multiple cores with PBS

2018-10-11 Thread Jeff Squyres (jsquyres) via users
>> On Oct 4, 2018, at 10:30 AM, John Hearns via users >>> wrote: >>> >>> Michele one tip: log into a compute node using ssh and as your own >>> username. >>> If you use the Modules environment then load the modules you use in >>> the

Re: [OMPI users] openmpi-v4.0.0rc5: ORTE_ERROR_LOG: Data unpack would read past end of buffer

2018-10-23 Thread Jeff Squyres (jsquyres) via users
Siegmar: the issue appears to be using the rank mapper. We should get that fixed, but it may not be fixed for v4.0.0. Howard opened the following GitHub issue to track it: https://github.com/open-mpi/ompi/issues/5965 > On Oct 23, 2018, at 9:29 AM, Siegmar Gross > wrote: > > Hi, > >

Re: [OMPI users] (no subject)

2018-11-01 Thread Jeff Squyres (jsquyres) via users
That's pretty weird. I notice that you're using 3.1.0rc2. Does the same thing happen with Open MPI 3.1.3? > On Oct 31, 2018, at 9:08 PM, Dmitry N. Mikushin wrote: > > Dear all, > > ompi_info reports pml components are available: > > $ /usr/mpi/gcc/openmpi-3.1.0rc2/bin/ompi_info -a | grep

Re: [OMPI users] check for CUDA support

2018-10-30 Thread Jeff Squyres (jsquyres) via users
ote: > > +1 to what Jeff said. > > So you would need --with-cuda pointing to a cuda installation to have > cuda-awareness in OpenMPI. > > On Tue, Oct 30, 2018 at 12:47 PM Jeff Squyres (jsquyres) via users > wrote: > The "Configure command line" shows you the comm

Re: [OMPI users] check for CUDA support

2018-10-30 Thread Jeff Squyres (jsquyres) via users
The "Configure command line" shows you the command line that was given to "configure" when building Open MPI. The "MPI extensions" line just indicates which Open MPI "extensions" were built. CUDA is one of the possible extensions that can get built. The CUDA Open MPI extension is actually an

Re: [OMPI users] Wrapper Compilers

2018-10-26 Thread Jeff Squyres (jsquyres) via users
On Oct 25, 2018, at 5:30 PM, Reuti wrote: > >> The program 'mpic++' can be found in the following packages: >> * lam4-dev >> * libmpich-dev >> * libopenmpi-dev > > PS: Interesting that they still include LAM/MPI, which was superseded by Open > MPI some time ago. ZOMG. As one of the last

Re: [OMPI users] Need Help - Thank you for this great tool

2018-11-07 Thread Jeff Squyres (jsquyres) via users
What is the exact problem you are trying to solve? Please send all the information listed here: https://www.open-mpi.org/community/help/ > On Nov 5, 2018, at 4:44 AM, saad alosaimi wrote: > > Dear All, > > First of all, thank you for this great tool. > Actually, I try to bind rank or

Re: [OMPI users] Cannot run MPI code on multiple cores with PBS

2018-10-04 Thread Jeff Squyres (jsquyres) via users
S this cluster is installed with? >> >> >> >> On Thu, 4 Oct 2018 at 00:02, Castellana Michele >> wrote: >> >> I fixed it, the correct file was in /lib64, not in /lib. >> >> Thank you for your help. >> >> On Oct 3, 2018, at 11:

Re: [OMPI users] opal_pmix_base_select failed for master and 4.0.0

2018-10-02 Thread Jeff Squyres (jsquyres) via users
(Ralph sent me Siegmar's pmix config.log, which Siegmar sent to him off-list) It looks like Siegmar passed --with-hwloc=internal. Open MPI's configure understood this and did the appropriate things. PMIX's configure didn't. I think we need to add an adjustment into the PMIx configure.m4 in

Re: [OMPI users] --mca btl params

2018-10-10 Thread Jeff Squyres (jsquyres) via users
On Oct 9, 2018, at 8:55 PM, Noam Bernstein wrote: > >> That's basically what the output of >> ompi_info -a >> says. You actually probably want: ompi_info | grep btl That will show you the names and versions of the "btl" plugins that are available on your system. For example, this is what I

Re: [OMPI users] Cannot run MPI code on multiple cores with PBS

2018-10-03 Thread Jeff Squyres (jsquyres) via users
It's probably in your Linux distro somewhere -- I'd guess you're missing a package (e.g., an RPM or a deb) out on your compute nodes...? > On Oct 3, 2018, at 4:24 PM, Castellana Michele > wrote: > > Dear Ralph, > Thank you for your reply. Do you know where I could find libcrypto.so.0.9.8 ?

Re: [OMPI users] error building openmpi-master-201809290304-73075b8 with Sun C 5.15

2018-10-01 Thread Jeff Squyres (jsquyres) via users
Siegmar -- I created a GitHub issue for this: https://github.com/open-mpi/ompi/issues/5814 Nathan posted a test program on there for you to try; can you try it and reply on the issue? Thanks. > On Oct 1, 2018, at 9:23 AM, Siegmar Gross > wrote: > > Hi, > > I've tried to install

Re: [OMPI users] [version 2.1.5] invalid memory reference

2018-09-19 Thread Jeff Squyres (jsquyres) via users
Yeah, it's a bit terrible, but we didn't reliably reproduce this problem for many months, either. :-\ As George noted, it's been ported to all the release branches but is not yet in an official release. Until an official release (4.0.0 just had an rc; it will be released soon, and 3.0.3 will

Re: [OMPI users] OpenMPI building fails on Windows Linux Subsystem(WLS).

2018-09-19 Thread Jeff Squyres (jsquyres) via users
I can't say that we've tried to build on WSL; the fact that it fails is probably not entirely surprising. :-( I looked at your logs, and although I see the compile failure, I don't see any reason *why* it failed. Here's the relevant fail from the tar_openmpi_fail file: - 5523 Making

Re: [OMPI users] How do I build 3.1.0 (or later) with mellanox's libraries

2018-09-19 Thread Jeff Squyres (jsquyres) via users
Alan -- Sorry for the delay. I agree with Gilles: Brian's commit had to do with "reachable" plugins in Open MPI -- they do not appear to be the problem here. From the config.log you sent, it looks like configure aborted because you requested UCX support (via --with-ucx) but configure wasn't

Re: [OMPI users] Difficulties when trying to download files?

2018-09-25 Thread Jeff Squyres (jsquyres) via users
Must have been some kind of temporary DNS glitch. Shrug. Next time it happens, also be sure to check https://downforeveryoneorjustme.com/download.open-mpi.org > On Sep 25, 2018, at 9:13 AM, Jorge D'Elia wrote: > > - Mensaje original - >> De: "Llolsten Kaonga" >> Para: "Jorge

Re: [OMPI users] --with-mpi-f90-size in openmpi-3.0.2

2018-09-27 Thread Jeff Squyres (jsquyres) via users
On Sep 27, 2018, at 12:16 AM, Zeinab Salah wrote: > > I have a problem in running an air quality model, maybe because of the size > of calculations, so I tried different versions of openmpi. > I want to install openmpi-3.0.2 with the option of > "--with-mpi-f90-size=medium", but this option

Re: [OMPI users] --with-mpi-f90-size in openmpi-3.0.2

2018-09-27 Thread Jeff Squyres (jsquyres) via users
On Sep 27, 2018, at 1:52 PM, Zeinab Salah wrote: > > Thank you so much for your detailed answers. > I use gfortran 4.8.3, what should I do? or what is the suitable openmpi > version for this version? If you build Open MPI v3.1.2 with gfortran 4.8.3, you will automatically get the "old"

Re: [OMPI users] One question about progression of operations in MPI

2018-11-16 Thread Jeff Squyres (jsquyres) via users
On Nov 13, 2018, at 8:52 PM, Weicheng Xue wrote: > > I am a student whose research work includes using MPI and OpenACC to > accelerate our in-house research CFD code on multiple GPUs. I am having a big > issue related to the "progression of operations in MPI" and am thinking your > inputs

[OMPI users] Open MPI SC'18 State of the Union BOF slides

2018-11-16 Thread Jeff Squyres (jsquyres) via users
Thanks to all who came to the Open MPI SotU BOF at SC'18 in Dallas, TX, USA this week! It was great talking with you all. Here are the slides that we presented: https://www.open-mpi.org/papers/sc-2018/ Please feel free to ask any followup questions on the users or devel lists. -- Jeff

Re: [OMPI users] Help Getting Started with Open MPI and PMIx and UCX

2019-01-18 Thread Jeff Squyres (jsquyres) via users
On Jan 18, 2019, at 12:43 PM, Matt Thompson wrote: > > With some help, I managed to build an Open MPI 4.0.0 with: We can discuss each of these params to let you know what they are. > ./configure --disable-wrapper-rpath --disable-wrapper-runpath Did you have a reason for disabling these?

Re: [OMPI users] v2.1.1 How to utilise multiple NIC ports

2018-12-20 Thread Jeff Squyres (jsquyres) via users
On Dec 20, 2018, at 3:33 PM, Bob Beattie wrote: > > I'm working on OpenFOAM v5 and have been successful in getting two nodes > working together. (both 18.04 LTS connected via GbE) > As both machines have a quad port gigabit NIC I have been trying to persuade > mpirun to use more than a single

Re: [OMPI users] v2.1.1 How to utilise multiple NIC ports

2018-12-22 Thread Jeff Squyres (jsquyres) via users
On Dec 22, 2018, at 10:56 AM, Bob Beattie wrote: > > How do I now go about setting up /etc/hosts, -hostfile entries and bringing > them all together on the mpirun run line ? > For example, my 2nd machine is a quad core Dell T3500. Should I create a > separate entry in /etc/hosts for each NIC

Re: [OMPI users] It's possible to get mpi working without ssh?

2018-12-19 Thread Jeff Squyres (jsquyres) via users
On Dec 19, 2018, at 11:42 AM, Daniel Edreira wrote: > > Does anyone know if there's a possibility to configure a cluster of nodes to > communicate with each other with mpirun without using SSH? > > Someone is asking me about making a cluster with Infiniband that does not use > SSH to

Re: [OMPI users] filesystem-dependent failure building Fortran interfaces

2018-12-04 Thread Jeff Squyres (jsquyres) via users
Hi Dave; thanks for reporting. Yes, we've fixed this -- it should be included in 4.0.1. https://github.com/open-mpi/ompi/pull/6121 If you care, you can try the nightly 4.0.x snapshot tarball -- it should include this fix: https://www.open-mpi.org/nightly/v4.0.x/ > On Dec 4, 2018,

Re: [OMPI users] [Open MPI Announce] Open MPI SC'18 State of the Union BOF slides

2018-11-27 Thread Jeff Squyres (jsquyres) via users
Bert -- Sorry for the slow reply; got caught up in SC'18 and the US Thanksgiving holiday. Yes, you are exactly correct (I saw your GitHub issue/pull request about this before I saw this email). We will fix this in 4.0.1 in the very near future. > On Nov 19, 2018, at 3:10 AM, Bert Wesarg

Re: [OMPI users] One question about progression of operations in MPI

2018-11-27 Thread Jeff Squyres (jsquyres) via users
Sorry for the delay in replying; the SC'18 show and then the US Thanksgiving holiday got in the way. More below. > On Nov 16, 2018, at 10:50 PM, Weicheng Xue wrote: > > Hi Jeff, > > Thank you very much for your reply! I am now using a cluster at my > university

Re: [OMPI users] Suggestion to add one thing to look/check for when running OpenMPI program

2019-01-09 Thread Jeff Squyres (jsquyres) via users
Good suggestion; thank you! > On Jan 8, 2019, at 9:44 PM, Ewen Chan wrote: > > To Whom It May Concern: > > Hello. I'm new here and I got here via OpenFOAM. > > In the FAQ regarding running OpenMPI programs, specifically where someone > might be able to run their OpenMPI program on a local

Re: [OMPI users] Increasing OpenMPI RMA win attach region count.

2019-01-09 Thread Jeff Squyres (jsquyres) via users
You can set this MCA var on a site-wide basis in a file: https://www.open-mpi.org/faq/?category=tuning#setting-mca-params > On Jan 9, 2019, at 1:18 PM, Udayanga Wickramasinghe wrote: > > Thanks. Yes, I am aware of that however, I currently have a requirement to > increase the default.

Re: [OMPI users] No network interfaces were found for out-of-band communications.

2018-09-12 Thread Jeff Squyres (jsquyres) via users
Can you send all the information listed here: https://www.open-mpi.org/community/help/ > On Sep 12, 2018, at 11:03 AM, Greg Russell wrote: > > OpenMPI-3.1.2 > > Sent from my iPhone > > On Sep 12, 2018, at 10:50 AM, Ralph H Castain wrote: > >> What OMPI version are we talking about

Re: [OMPI users] *** Error in `orted': double free or corruption (out): 0x00002aaab4001680 ***, in some node combos.

2018-09-13 Thread Jeff Squyres (jsquyres) via users
On Sep 12, 2018, at 4:54 AM, Balázs Hajgató wrote: > > Setting mca oob to tcp works. I will stick to this solution in our production > environment. Great! > I am not sure that it is relevant, but I also tried the patch on a > non-production OpenMPI 3.1.1, and "mpirun -host nic114,nic151

Re: [OMPI users] Cannot install open mpi (Mac Mojave 10.14.2 (18C54))

2019-02-07 Thread Jeff Squyres (jsquyres) via users
The same steps you list for Open MPI v2.0.2 should work for Open MPI v4.0.0. Can you send the full set of information listed here: https://www.open-mpi.org/community/help/ > On Feb 1, 2019, at 4:00 PM, Neil Teng wrote: > > Hi, > > I am following these steps to install the